Supporting SARS-COV2 related research

capri

Joined: 17 Mar 20
Posts: 3
Credit: 2,266
RAC: 0
Message 290 - Posted: 8 Apr 2020, 16:39:14 UTC - in response to Message 274.  

Hello,

As there will be a constant flow of work for the next few weeks, if a host is done with processing a set of jobs, the host can always fetch additional work-units. There will be several thousand jobs submitted daily for the next few days.
We will watch the job turnaround time with the current setting, and we can always adjust the limit on the number of work-units a computer can hold at a given point in time. As of now, each host will be able to have 256 active tasks from the queue at any given time. If they finish processing, for example, 128 tasks, they can always get an additional 128 tasks from the queue. Any volunteer can continue requesting jobs from the queue once they finish processing jobs and stay below the limit.

Sincerely,
the BOINC@TACC development team


Just wondering if that limitation is working as expected. My oldish laptop downloaded about 700 tasks today and over 600 are still waiting to start. That's enough work for the next few days, while other computers may be idle.
ID: 290 · Rating: 0
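To make the refill rule quoted above concrete, here is a minimal sketch in Python. The function name `tasks_to_send` and its parameters are hypothetical and are not part of BOINC; the sketch only illustrates the stated policy that a host may hold at most 256 tasks at a time and can top up as tasks finish.

```python
def tasks_to_send(in_progress: int, requested: int, per_host_limit: int = 256) -> int:
    """Return how many new tasks a host may receive under a fixed per-host cap.

    Hypothetical helper: 256 is the limit stated in the thread, everything
    else is an illustrative assumption.
    """
    if in_progress < 0 or requested < 0:
        raise ValueError("counts must be non-negative")
    # Room left under the cap; never negative, even if the host is over the cap.
    headroom = max(per_host_limit - in_progress, 0)
    return min(requested, headroom)


# A host that held 256 tasks and finished 128 of them still has 128 in progress.
print(tasks_to_send(in_progress=128, requested=500))  # 128
```

Under this rule a host already holding 700 tasks would receive no further work, so the report above suggests the cap may not have been applied as described.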
rituarora
Project administrator
Project developer
Project scientist

Joined: 4 Feb 19
Posts: 113
Credit: 0
RAC: 0
Message 291 - Posted: 8 Apr 2020, 17:13:38 UTC - in response to Message 290.  

Greetings - thanks for letting us know. We will revisit the settings again today.

Best Regards,
The BOINC@TACC Team

ID: 291 · Rating: 0
rituarora
Project administrator
Project developer
Project scientist

Joined: 4 Feb 19
Posts: 113
Credit: 0
RAC: 0
Message 292 - Posted: 8 Apr 2020, 17:14:21 UTC - in response to Message 286.  

Thanks Greg for all the suggestions thus far and for your kind support!

Best Regards,
The BOINC@TACC Team

At the moment my queue for my machine is full. But I will bump my resource share up to 150% to get your project at the top of the list for new tasks.

Thanks, noted the issue and we have it on our priority list. We will post an update once we know the underlying cause. At least one volunteer has confirmed that the jobs are getting processed just fine on their end with VirtualBox 6.1.4.




Team Members,

Just got my first 4 tasks from you guys; 3 errored out, and I aborted the other one due to a technical problem.

If you want us to complete tasks properly, you should ensure that they run on VBOX 6.1.4 before you release them. If you can't make it work with 6.1.4 then I guess I will have to leave the project since all my other projects work fine on 6.1.4, which would be a shame.

I will go look for a technical board to continue this topic in its specifics.
ID: 292 · Rating: 0
rituarora
Project administrator
Project developer
Project scientist

Joined: 4 Feb 19
Posts: 113
Credit: 0
RAC: 0
Message 293 - Posted: 8 Apr 2020, 17:16:07 UTC

Dear All,

Here is a news article about the research on a COVID-19 vaccine for which BOINC@TACC and other TACC systems are being used:

https://elpasoheraldpost.com/utep-school-of-pharmacy-developing-covid-19-vaccine-drug-treatments-using-supercomputing/

Thanks,
The BOINC@TACC Team
ID: 293 · Rating: 0
Henk Haneveld

Joined: 17 Feb 19
Posts: 7
Credit: 8,682
RAC: 0
Message 294 - Posted: 8 Apr 2020, 18:19:06 UTC - in response to Message 291.  

Admin, when you do that you may want to reconsider which settings to use.
In another post you gave a fixed maximum of 256 results per host.
It would be better to use a fixed number per core.
Example: 2 per core on a 128-core host will also give 256 results, but for a host with only 4 cores this will be a maximum of just 8 results.
This will give a more even distribution based on the processing capacity of the host.

Henk

ID: 294 · Rating: 0
xRXT2toIGA
Project administrator
Project developer
Project scientist

Joined: 4 Feb 19
Posts: 61
Credit: 4,266
RAC: 0
Message 297 - Posted: 9 Apr 2020, 0:50:44 UTC

Hello,

Thank you all for bringing this to our attention. We have reduced the number of jobs a client can run to 5 per CPU core; i.e., a processor with 4 cores can get a maximum of 20 jobs at the same time. There will be a validation run for the same number of jobs. Hence the host would be running 40 jobs in this example.

Sincerely,
the BOINC@TACC development team
ID: 297 · Rating: 0
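The per-core arithmetic in this post can be sketched as follows. This is only an illustration: `max_tasks_per_host` is a hypothetical name, and the doubling for validation runs simply restates the post's own example rather than any confirmed scheduler behavior.

```python
def max_tasks_per_host(cores: int, per_core: int = 5) -> int:
    """Per-host task cap when the limit is expressed per CPU core.

    Hypothetical helper; 5 per core is the value announced in the thread.
    """
    if cores < 1 or per_core < 0:
        raise ValueError("cores must be >= 1 and per_core must be >= 0")
    return cores * per_core


base = max_tasks_per_host(4)
print(base)      # 20 jobs for a 4-core host
print(base * 2)  # 40 if an equal number of validation runs is counted, as the post states

# The earlier suggestion of 2 per core, for comparison.
print(max_tasks_per_host(4, per_core=2))    # 8
print(max_tasks_per_host(128, per_core=2))  # 256
```

Scaling the cap with core count keeps a small laptop from hoarding work while still letting a large server fill its queue.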
Henk Haneveld

Joined: 17 Feb 19
Posts: 7
Credit: 8,682
RAC: 0
Message 307 - Posted: 9 Apr 2020, 18:34:37 UTC - in response to Message 297.  

The change to 5 per core seems to work nicely. What I don't understand is how you get the number of 40 jobs.

What I do notice is that sometimes I get both the _0 and _1 version of the same job.
If you want solid validation they should be sent to separate hosts, and you should check your settings to fix that.
If validation is not that important, then there is no point in sending out 2 versions of the same job.
ID: 307 · Rating: 0
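The point about sending the _0 and _1 replicas to separate hosts can be illustrated with a small sketch. This is not how the BOINC scheduler is implemented; `pick_host`, the host names, and the job name are all hypothetical and only show the constraint being asked for.

```python
from typing import Dict, List, Optional, Set


def pick_host(workunit: str, candidates: List[str],
              assigned: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first candidate host that holds no replica of this workunit.

    `assigned` maps a workunit name to the set of hosts that already
    received one of its replicas. Returns None if no host qualifies.
    """
    used = assigned.setdefault(workunit, set())
    for host in candidates:
        if host not in used:
            used.add(host)
            return host
    return None


assigned: Dict[str, Set[str]] = {}
hosts = ["host_a", "host_b"]
first = pick_host("job42", hosts, assigned)   # replica _0
second = pick_host("job42", hosts, assigned)  # replica _1
print(first, second)  # host_a host_b
```

With only two hosts available, a third replica of the same job would get `None` back, so two results from the same machine are never compared against each other.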
rituarora
Project administrator
Project developer
Project scientist

Joined: 4 Feb 19
Posts: 113
Credit: 0
RAC: 0
Message 316 - Posted: 10 Apr 2020, 9:04:20 UTC - in response to Message 307.  

Hello Henk,

It seems like with the current project settings, the validation is indeed running on the same hosts. We will look into adjusting these settings.

Thanks,
The BOINC@TACC Team

ID: 316 · Rating: 0
fsu95OTBmn

Joined: 7 Mar 19
Posts: 15
Credit: 376,286
RAC: 0
Message 352 - Posted: 12 Apr 2020, 16:50:15 UTC - in response to Message 316.  

What are the current settings for maximum work per host? I'm not asking to have anything changed, only trying to understand the limits imposed. If it is 5 per core (thread), then my 128-thread system should get 640 as a maximum. It is only getting 320. Is there also a per-host limit set?
ID: 352 · Rating: 0
xRXT2toIGA
Project administrator
Project developer
Project scientist

Joined: 4 Feb 19
Posts: 61
Credit: 4,266
RAC: 0
Message 355 - Posted: 13 Apr 2020, 2:41:48 UTC - in response to Message 352.  

Greetings. A 128-core system should receive a maximum of 640 tasks at a time. Are all 128 threads hardware cores, or is this the hyperthread number? As of now, we are setting 5 per core, so you should be receiving 640 total. However, if your computer uses hyperthreads or virtual cores, it is possible that you may be receiving fewer tasks.

Sincerely,
the BOINC@TACC development team
ID: 355 · Rating: 0
fsu95OTBmn

Joined: 7 Mar 19
Posts: 15
Credit: 376,286
RAC: 0
Message 356 - Posted: 13 Apr 2020, 3:41:08 UTC - in response to Message 355.  
Last modified: 13 Apr 2020, 3:43:46 UTC

Thanks for the response. This system has 64 hardware cores and 64 virtual cores (hyperthreading) that make up the 128 threads. If your system has detected only the 64 hardware cores, then 320 is the correct number. However, I have 10 other systems that are a mix of hardware and virtual cores (hyperthreaded), and your system sends work units to those systems based on both combined, as expected. Why doesn't the scheduler recognize the virtual cores on the EPYC system when it does on the others?
ID: 356 · Rating: 0
xRXT2toIGA
Project administrator
Project developer
Project scientist

Joined: 4 Feb 19
Posts: 61
Credit: 4,266
RAC: 0
Message 359 - Posted: 14 Apr 2020, 2:11:02 UTC - in response to Message 356.  

Hello,

We have not tested the BOINC clients with hyper-threading enabled. We will post an update once we have tested the mapping of tasks to the hyper-threads. We will also check with other BOINC projects about this through the BOINC mailing list.

Sincerely,
the BOINC@TACC development team
ID: 359 · Rating: 0
rituarora
Project administrator
Project developer
Project scientist

Joined: 4 Feb 19
Posts: 113
Credit: 0
RAC: 0
Message 363 - Posted: 15 Apr 2020, 0:10:34 UTC
Last modified: 15 Apr 2020, 0:20:24 UTC

Greetings,

As of April 12, 2020, the researchers had completed screening more than 1,124,781 compounds through BOINC@TACC. There are many more results from the screening jobs completed on April 13 and 14, 2020, and we will know their exact count after the post-processing scripts have run.

The researchers are in the process of preparing new datasets for the next round of screening compounds. Until these datasets are ready (hopefully in the next 2-3 days), the number of jobs running through BOINC@TACC will be reduced.

Additionally, below is the summary of what we have found so far regarding the faulty work-units that were run on some hosts in the previous few batches of jobs:
1) In a particular batch of jobs, a file in the dataset got corrupted and the application failed to parse it, and hence produced error messages.
2) There were some input files (in the library of millions of files) that did not contain the correct parameters. Such input is hard to detect manually. The researchers have been informed about this and they will be writing scripts to check the quality of the input before submitting the next batch of jobs.
3) VirtualBox versions 6.1.4 and 6.1.2 are not compatible with the BOINC clients on some operating systems, and hence the work-units may not get processed smoothly on these combinations of VirtualBox and operating systems.

Thanks for your continued support for BOINC@TACC!
The BOINC@TACC Team
ID: 363 · Rating: 0
Elevator Startups

Joined: 16 Apr 20
Posts: 1
Credit: 0
RAC: 0
Message 366 - Posted: 16 Apr 2020, 12:14:53 UTC

Hi, could you tell us: should we expect new work units any time soon to fight the pandemic? Hope to donate…
ID: 366 · Rating: 0
rituarora
Project administrator
Project developer
Project scientist

Joined: 4 Feb 19
Posts: 113
Credit: 0
RAC: 0
Message 368 - Posted: 16 Apr 2020, 18:15:25 UTC - in response to Message 365.  

Greetings,

We may not be able to test VirtualBox 6.1.2 and 6.1.4 with the different operating systems before the new work units are submitted by the researchers. We are under a "shelter at home" order right now and do not have access to the additional hardware for installing and testing different operating systems and VirtualBox versions. However, if we are able to get access to bare-metal nodes in the cloud and succeed with the testing, we will post an update.

Thanks,
BOINC@TACC Team

Question:

Are you going to require or test the new tasks on 6.1.2 and 6.1.4 before they are released?
It would be nice if they ran on at least 6.1.2.
ID: 368 · Rating: 0
rituarora
Project administrator
Project developer
Project scientist

Joined: 4 Feb 19
Posts: 113
Credit: 0
RAC: 0
Message 369 - Posted: 16 Apr 2020, 18:15:54 UTC - in response to Message 366.  

Greetings,

Yes, new work units should be available soon. There are several million comparisons of compounds that still need to be done. Thanks for your kind support!

Thanks,
BOINC@TACC Team

ID: 369 · Rating: 0
fsu95OTBmn

Joined: 7 Mar 19
Posts: 15
Credit: 376,286
RAC: 0
Message 370 - Posted: 16 Apr 2020, 21:47:50 UTC - in response to Message 355.  

Is this parameter set on the server:

<max_ncpus>N</max_ncpus>
An upper bound on NCPUS (default: 64)


If so, could it be set to 128 or higher? There are servers shipping right now that have 256-thread capability. Since this is an upper-bound parameter and the server is going to use whatever is reported from the client, why not set it to 512 and not have to revisit it in the near term? Thank you in advance for your consideration.
ID: 370 · Rating: 0
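If the `<max_ncpus>` bound quoted above is in effect at its default of 64 (an assumption; the thread does not confirm the server's setting), the 320-task figure reported earlier would follow directly. The sketch below uses a hypothetical function, `effective_task_limit`, to show the arithmetic; it is not BOINC scheduler code.

```python
def effective_task_limit(reported_cpus: int, per_core: int = 5,
                         max_ncpus: int = 64) -> int:
    """Task cap when the CPU count used for the limit is itself capped.

    Hypothetical model: the server counts at most `max_ncpus` CPUs,
    then multiplies by the per-core limit from the thread (5).
    """
    if reported_cpus < 1 or max_ncpus < 1:
        raise ValueError("CPU counts must be >= 1")
    counted_cpus = min(reported_cpus, max_ncpus)
    return counted_cpus * per_core


print(effective_task_limit(128))                 # 320 with the assumed default bound of 64
print(effective_task_limit(128, max_ncpus=512))  # 640 once the bound is raised
```

This would also be consistent with smaller hyperthreaded hosts getting their full allocation, provided they report 64 or fewer threads, though that is an inference rather than something the thread confirms.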
Henk Haneveld

Joined: 17 Feb 19
Posts: 7
Credit: 8,682
RAC: 0
Message 373 - Posted: 17 Apr 2020, 6:56:32 UTC - in response to Message 371.  

I run version 5.2.38
ID: 373 · Rating: 0