Hardware Optimisation & Benchmarking Shenanigans
Hi,
As we all know, taking as many good pictures as possible is the best way to ensure things go well, and ultimately it saves a lot of time back in the office.
However, time is always against us. As fast as Capturing Reality is, waiting is inevitably part of the game, and it can be a long wait: days or more, only to find out your photo shoot was not good enough.
The hardware requirements are listed here https://support.capturingreality.com/hc/en-us/articles/115001524071-OS-and-hardware-requirements, however they are very vague. From experience with rendering, video encoding etc., a badly configured system that looks great on paper can perform half as fast as a similarly priced system with carefully selected components, optimised appropriately. Throwing more money at the problem is not always the answer, and at times it can slow things down.
Various stages of the calculation stress different parts of the system, but to what extent I am struggling to figure out. How can I/we optimise a system so that it performs best with the software?
I recently got rid of my dual-Xeon v3 28-core workstation, which was awesome for rendering but painfully slow in RealityCapture. A much higher-clocked, newer-architecture consumer Skylake system is not hugely different in RealityCapture (yes, a little slower), yet it is 4x+ slower for rendering (Cinebench), cost 5x+ less and has 4 cores versus 28.
Below are the areas I know can make a difference. Unfortunately, as with many things, we can't have our cake and eat it. Cost has a big influence, and technological restrictions mean you can have 32 GB of very fast RAM or 128 GB+ of slow RAM; you can have a 5.2 GHz 6-core CPU or a 3.6 GHz 16-core CPU.
- CPU speed, MHz (and IPC) - more is always better.
- Core count (threads) - more is better, to an extent, and not at the cost of IPC. From my experience a dual-CPU system worked awesomely with some applications, however the system architecture did not agree with others, such as RealityCapture, and it underperformed. I have a feeling that, as with GPUs, you get increasingly diminishing returns in RealityCapture when increasing the core count, even if the CPUs are maxed at 100%.
- CPU instruction support (AVX etc.) - does RealityCapture take advantage of it, or will it soon, and to what extent? I see you are looking for an AVX coder. Has AVX been enabled when the software is compiled or tested?
I personally am hoping to build a new system. AMD at last offer good-value CPUs with Threadripper and EPYC, however they do not support AVX well at all. It would be a disaster to invest in the wrong architecture. I am aware that AMD hardware does not perform ideally with other software due to weak AVX. Is this/will this be true with RealityCapture?
- GPU count - 3 is the max, and as with most things, you get diminishing returns.
- GPU speed/CUDA performance - 1080 Ti/Titan/Quadro etc. are the go-to cards, with the Ti being the best bang for buck. The new Tesla V100s are compute monsters with a cost to match. Soon* we should have the consumer Volta Titans and gaming cards available.
- GPU memory - is 12 GB enough? RealityCapture frequently complains that I do not have enough video memory; maybe this is a bug, as my monitoring software says only around 1 GB is in use.
- RAM amount - RealityCapture is fantastic in that, in theory, it doesn't require massive amounts like its competitors, however it does have its limits. What impact does maxing out the RAM and falling back on the swap file have on performance?
I have encountered out-of-memory errors in RealityCapture many times; is throwing more RAM at the system the best solution?
- RAM speed - 2666 MHz or 4400 MHz?
- RAM latency - ties into the above; some apps love faster speed or tighter timings. From my experience, optimising cache and memory performance for the CPU/RAM can double the speed of certain applications. Has this been tested? There sure is a lot of data being passed about.
- HDD for cache/virtual memory - latency vs speed. I expect this is less important, however every bit counts to an extent. I assume this becomes more valuable once RAM limits are hit.
From all the above it's easy to choose the best of each, but you can't: you'll have to sacrifice one area to get maximum performance in another.
So, the solution:
Benchmark datasets - I searched the forum and found others have asked about the availability of a benchmark, and some even stated they would create one, however that was a year+ ago and nothing came of it.
Unless an integrated benchmarking tool is about to appear in the software very soon (which would be best), I propose the following.
Have 2 different datasets available to run, to reflect varying workloads. (I can make some, we could use data provided by Capturing Reality, or maybe someone can suggest something suitable.)
a) Light dataset - will be fast.
b) Heavy dataset - will take longer, however it may give more accurate results.
Users will then Shift-start the application (holding Shift resets the settings) and hit start. Theoretically everyone should be on the same level.
Users will be required to upload the contents of the logs created either to the forum thread or, ideally, to a Google form I will create.
The easy part: RealityCapture.log. This is basically a duplicate of the console window and logs the timestamps for the various stages as they complete. It should be located here: c:\Users\USER\AppData\Local\Temp\
It pumps out the following as an example:
RealityCapture 1.0.2.3008 Demo RC (c) Capturing Reality s.r.o.
Using 8 CPU cores
Added 83 images
Feature detection completed in 11 seconds
Finalizing 1 component
Reconstruction completed in 31.237 seconds
Processing part 1 / 5. Estimated 1225441 vertices
Processing part 3 / 5. Estimated 38117 vertices
Processing part 4 / 5. Estimated 926526 vertices
Processing part 5 / 5. Estimated 538277 vertices
Reconstruction in Normal Detail completed in 232.061 seconds
Coloring completed in 30.105 seconds
Coloring completed in 0.116 seconds
Coloring completed in 30.363 seconds
Creating Virtual Reality completed in 294.092 seconds
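If anyone just wants the timing lines out of that log rather than copying the whole thing, here is a minimal sketch (assuming the log really does sit in %TEMP% as above; the timings.txt output file is just my illustration):
@echo off
rem Minimal sketch: pull only the "... completed in X seconds" lines
rem out of RealityCapture.log (path assumed to be %TEMP%, as noted above).
findstr /C:"completed in" "%TEMP%\RealityCapture.log" > "%~dp0timings.txt"
type "%~dp0timings.txt"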
The trickier part: system analysis. There is a nice little freeware tool called hardwareinfo that does not require installation and can spit out a nice little text report like the one below; it contains no sensitive info. These two logs combined should contain all the information we need to compile a nice comparative dataset. When I say we, I mean me: I'll have to parse the data into a Google spreadsheet which will do the calculations, and then we can all see the results.
CPU: Intel Core i7-6700K (Skylake-S, R0)
4000 MHz (40.00x100.0) @ 4498 MHz (45.00x100.0)
Motherboard: ASUS MAXIMUS VIII HERO
Chipset: Intel Z170 (Skylake PCH-H)
Memory: 32768 MBytes @ 1599 MHz, 16-18-18-36
Graphics: NVIDIA GeForce GTX 1080 Ti, 11264 MB GDDR5X SDRAM
Drive: Samsung SSD 850 EVO 500GB, 488.4 GB, Serial ATA 6Gb/s @ 6Gb/s
Sound: Intel Skylake PCH-H - High Definition Audio Controller
Sound: NVIDIA GP102 - High Definition Audio Controller
Network: Intel Ethernet Connection I219-V
OS: Microsoft Windows 10 Professional (x64) Build 15063.674 (RS2)
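For anyone who would rather grab similar details straight from the command line, here is a rough sketch using stock Windows WMI queries; the property list is my own choice for illustration, nothing official:
@echo off
rem Sketch: collect basic system info with stock Windows WMI queries.
rem The property selection is illustrative only.
wmic cpu get Name,NumberOfCores,NumberOfLogicalProcessors /value
wmic computersystem get NumberOfProcessors,TotalPhysicalMemory /value
wmic memorychip get Speed /value
wmic path win32_VideoController get Name /value
wmic diskdrive get Model,Size /value
wmic os get Version /value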
I'll need your help :)
A) input from my wall of text above.
B) Suggestions on the proposed benchmark & setup.
C) To run the benchmark and post the results.
If you've read through all that and think, "Yeah, I'd spend 15 minutes running the test files and report back", please say so.
If you've read part of it and fell asleep thinking, "Ain't nobody got time for that", please say so too :D
What do we get out of all this?
Eventually, when/if enough people with varying hardware post their results, we can determine where to spend our precious money to improve the areas of RealityCapture where we are bottlenecked: which components and configurations help the most with, say, reconstruction or texturing, and which hardware is just ineffective.
What say you? Do you think this is a worthwhile task, and should I proceed?
-
I guess installing the Demo next to the Promo is the same as next to the CLI, since both paid versions are the same code, just with different restrictions.
I'll open a new thread and ask out loud if it might be an issue.
-
Hey Ivan,
I feel a bit bad that I haven't done anything yet, but I'm really hesitant to change a running system.
I've just approached Michal again about how to make it more feasible for non-CLI users.
Have you thought about creating a new thread with a distinct title to get more people to participate?
-
Hi Götz - don't feel bad.
I've been a bit quiet over the last few days; I've had my head buried in the coding, plus regular life.
I thought a good solution would be to install the Steam demo, as that would allow you to run it alongside the Promo. I have code to check which version is installed and run the appropriate things.
When things are working better and are more complete, I'll create a new thread. For the moment I have rewritten a lot of the code to avoid a potential bug, and I have version checking in place. I have mostly been working on the online side of things.
Soon I'll be uploading 0.2 :)
-
Wow, you are really putting a lot of effort into it!
I can't thank you enough!
So I wait for the 0.2 version?
-
I'll upload it soon - I have various code branches trying different methods out etc. :)
The website side is very messy and unfinished, so it's more of a concept at this stage. At first I'm just trying to get the basics working nicely, then I can improve the layout etc. from there. Syntax is my demon... a ; or " here or there makes a huge difference, and is testing me :D
I shall be using Valve time. With that said: soon! :)
-
No rest for the wicked!
Alpha 0.4 can be downloaded here.
It now works with the Steam demo (which is required if you have a Promo license). It's free, quick and easy to install alongside your existing installation.
I have tested the Steam demo alongside the regular install of the application and it has no issues, so the benchmark should run without any problem with your Promo edition or regular license.
Instructions:
1) Unzip (the new code should no longer be restricted to the desktop as before).
2) Choose your images and place them inside the newly created /CapturingRealityBenchmark/Images folder.
For now I'd suggest a smaller collection that you know works; none are currently included.
3) Run the benchmark.bat file.
You will be asked to enter an identifier/nickname at the start.
4) Sit back and relax.
5) Once the benchmark has run its course you will be given the option to enter any additional notes.
Once the benchmark is complete the results should pop up.
Current Known Issues/Potential Issues
1) If the dataset is too small or the computer is too fast, a section completing in <15 s may not have its timestamp recorded; fix: increase the number of photos. - FIXED
2) I cannot identify whether more than 1 GPU is present (requires the CUDA toolkit), or we must wait until my workstation arrives so I can test multi-GPU. - CURRENT
2.5) The same goes for multiple HDDs. - CURRENT
3) I run Windows 10; I am unsure whether all the commands/scripts will work on earlier versions/VMs/servers. - TESTED OK
4) The code is English, as are the commands; I do not know whether they work with other locales. - CURRENT
5) It will likely only run with the Demo & Full/CLI versions of the application, so if you have the Promo, please try installing the demo. - FIXED - the Steam demo can be installed alongside the Promo without issues.
6) The script assumes you have installed the application in the default directory. - CURRENT
7) Admin privileges may be required. - CURRENT
8) Be wary of running software from unknown internet sources. Both *.bat files are plain text; you are free to inspect the code in Notepad to make sure there are no shenanigans. You can also check with www.virustotal.com.
9) The project will delete your cache and may change some application settings away from the defaults. Fear not, a backup of your settings is saved first as "GlbBackup.bak.rcconfig".
10) You looked at the code and wonder why it's such a mess, why I did it that way and why it took me so long? Me too. I'm no expert. - CURRENT
11) If you have made a suggestion and I ignored or refuted it, sorry. If you think it is important, try a different way to convince me; I may not have understood. This project is for the benefit of us all, and my opinion is just one of many. Everyone's input and suggestions are valued :)
12) Gremlins cause mischief if you feed them after midnight. Also, if you allow the benchmark to run over midnight, the timer can fail due to the 24-hour clock reset; I need to rework the way time is recorded to avoid this (a rough sketch of one possible fix follows after the changelog below).
Changelog:
Rewritten code to avoid missing timestamps.
Lots more parsing.
Added a check for the Steam demo, which runs an alternative script.
Extracted the application version.
Various logic added (probably considered not logical).
Feedback is appreciated, especially if it does not work.
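On the midnight issue in 12) above, this is roughly the kind of fix I have in mind - a sketch only, not the benchmark's actual code; the sample stamps are made up to show a run crossing midnight:
@echo off
setlocal
rem Sketch: midnight-safe elapsed time from two %TIME%-style stamps.
set "START=23:58:30.00"
set "END=00:08:30.00"
rem Prefixing each field with 1 and subtracting 36610100 avoids octal errors on 08/09.
for /f "tokens=1-4 delims=:.," %%a in ("%START%") do set /a S=(((1%%a*60)+1%%b)*60+1%%c)*100+1%%d-36610100
for /f "tokens=1-4 delims=:.," %%a in ("%END%") do set /a E=(((1%%a*60)+1%%b)*60+1%%c)*100+1%%d-36610100
set /a DIFF=E-S
rem If the run crossed midnight the end stamp looks "earlier" than the start, so add 24 hours.
if %DIFF% lss 0 set /a DIFF+=8640000
set /a SECS=DIFF/100
echo Elapsed: %SECS% seconds
With those sample values it prints "Elapsed: 600 seconds", i.e. the 10 minutes that straddle midnight.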
-
Hello Ivan,
I followed the Yellow Brick Road, but alas got this message in newresults:
BvC
1.0.2.3008 RC
ECHO is off.
I had installed the Demo version of RC, but not on my system drive, which is nearly full; I didn't want to invite out-of-space issues. Please advise. Thanks.
Benjy
EDIT: I suspected the .bat file was looking for the Demo version on the C drive; perhaps that's in your instructions, which I admittedly rushed into without reading everything. I hit another hiccup and suspected I needed to run the Demo once before the .bat, which called for me to sign in, so once past that and with the demo running on C, it's now running. I'm glad I could get this going before adding RAM; I will run again for comparison.
At this point are we simply using it locally and feeding back whether everything is working? Let us know when we should run it with results automatically uploading to a central place. And thanks, Ivan, for all your efforts. I'm sure it's been a bit of a time suck for a scattered response, but know we're here. Many thanks.
Benjy
EDIT 2: My results report:
BvC
10:03:07.01
1.0.2.3009 Demo RC
97
316.763
617.171
60.887
410.802
10:29:07.27
NumberOfProcessors=1
Name=Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
NumberOfCores=4
NumberOfLogicalProcessors=8
Name=NVIDIA GeForce GTX 980 Ti
Name=Intel(R) HD Graphics 4600
gpucount=1
TotalPhysicalMemory=8448188416
Speed=1333
Model=Seagate Backup+ Desk SCSI Disk Device
Size=5000978465280
Model=SAMSUNG 470 Series SSD
Size=256052966400
Model=Crucial_CT960M500SSD1
Size=960194511360
Model=SanDisk SDSSDX240GG25
Size=240054796800
Model=Seagate Backup+ Hub BK SCSI Disk Device
Size=6001172513280
Version=10.0.15063
PeakUsage=1202
Ran without event.
-
Hi Benjamin
Thank you for the feedback. Having the software installed in a custom directory is currently a known issue, no. 6 on the list.
The issue is that the location where the application/demo is installed is not exposed as a variable. I'll have a think, however it is not high on the list, as the coding required to handle such unknowns takes time.
As the install size of the demo is only 46 MB, it shouldn't be a huge issue.
-
Hello Ivan,
I'm curious why your list of results displays so differently from mine. Yours presents functions and the associated times:
RealityCapture 1.0.2.3008 Demo RC (c) Capturing Reality s.r.o.
Using 8 CPU cores
Added 83 images
Feature detection completed in 11 seconds
Finalizing 1 component
Reconstruction completed in 31.237 seconds
Processing part 1 / 5. Estimated 1225441 vertices
Processing part 3 / 5. Estimated 38117 vertices
Processing part 4 / 5. Estimated 926526 vertices
Processing part 5 / 5. Estimated 538277 vertices
Reconstruction in Normal Detail completed in 232.061 seconds
Coloring completed in 30.105 seconds
Coloring completed in 0.116 seconds
Coloring completed in 30.363 seconds
Creating Virtual Reality completed in 294.092 seconds
Mine only has numbers:
BvC
10:03:07.01
1.0.2.3009 Demo RC
97
316.763
617.171
60.887
410.802
10:29:07.27
NumberOfProcessors=1
Name=Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
NumberOfCores=4
NumberOfLogicalProcessors=8
Name=NVIDIA GeForce GTX 980 Ti
Name=Intel(R) HD Graphics 4600
gpucount=1
TotalPhysicalMemory=8448188416
Speed=1333
Model=Seagate Backup+ Desk SCSI Disk Device
Size=5000978465280
Model=SAMSUNG 470 Series SSD
Size=256052966400
Model=Crucial_CT960M500SSD1
Size=960194511360
Model=SanDisk SDSSDX240GG25
Size=240054796800
Model=Seagate Backup+ Hub BK SCSI Disk Device
Size=6001172513280
Version=10.0.15063
PeakUsage=1202
Ran without event.
Thanks for clarifying.
Benjy
-
Thank you Benjy.
The results differ from the example shown originally on the first upload; the reason is parsing. It is easier to deal with just a number as an attribute rather than a number plus a name. I will likely change this in the future so the txt file is clearer for the end user to read; I am just trying various things out to see the effects.
I am currently (when I have bits of free time) working on the web part of the benchmark, which will give the user the option to upload the results at the end for comparison, somewhat similar to this: https://corona-renderer.com/benchmark/26, however with more selection/comparison variables.
My workstation has been delayed by 2+ weeks due to parts :/ So it is very useful to see how a system with multiple GPUs and multiple HDDs is output. Your results are useful, as I can see some changes to the code need to be made.
I don't believe we actually need all of the HDD info, just which drive is being used as the cache drive, so I need to figure that out.
For the moment the results you see are as follows. You could, if you wish, use it as a guide just for yourself if you change hardware or run the same project on another PC, to see the difference.
10:03:07.01 - benchmark start time
1.0.2.3009 Demo RC - application version
97 - alignment time
316.763 - depthmap time (GPU accelerated)
617.171 - Model creation
60.887 - Simplify model
410.802 - Texture
10:29:07.27 - benchmark end time
-
Most useful; I can drop the categories into a column in a spreadsheet, then paste updated performance values into new rows for each configuration modification or for other people's PCs.
Is there any way to track whenever virtual memory is invoked and how much time is chewed up by those reads/writes?
I'm also wondering about memory latency and what benefit, if any, there is to low values. I've read that low latency doesn't add much, if anything, to frame rates in video games. Is low latency cost-justified?
Lastly, I've been urging new users to participate, one of whom is building nodes for crunching bitcoin data, bringing large datasets into RC as needed. We both agree that whenever you get to the point where a shared set of images is established, to standardize that variable among all participants, it would be most useful to have a third category beyond the small 30-image test and the medium 500-image test you described. We need a large one, right up to the 2,500-image limit in Promo. Not much is learned if image pixel density and count don't push system resources to the brink; the bottleneck(s) reveal the thing of value, and 500 images, even at 42 MP, isn't much of a load on even a modest setup like mine. Having three image folders - small, medium, and large - which can all come from the same scene, gives the user the choice of what to bite off, with no imposition. If this is the right idea, then it's probably also useful to code the ability to intervene with a save command, and then a resume. That further supports users trying the 2,500 option: let it run at night, save during the day to work in RC on paying projects or other apps requiring system resources, then jump back in at night with resume.
I know this is all phase II, just a thought.
-
Tracking things like VM usage over time is possible, however I have avoided it, as the exported data gets messy. I may even remove the peak VM usage.
I can't actually retrieve the memory latency information, but I can get the memory speed. It is a fair question and exactly why I have included it. From my own experience, in some applications it makes almost no difference whilst in others it can double performance; this is the way we can find out. Latency tends to go up with speed.
I agree some thought needs to go into the datasets and the stress that is put on the systems; that will require varying loads to get the desired results.
-
Hello Ivan,
I rebuilt my PC with the new motherboard, CPU and system memory, and was happy to see everything power up and launch (the Windows 10 license now transfers automatically). I ran benchmark.bat and was happy to see the improvements. Here's a link to the results comparison, using your formatting and adding a couple of useful formulas comparing the data.
One thing I missed the first go-round, though it was staring me right in the face, was the first report on memory only showing 8 GB. I tend to trust this, but am troubled by that fact; it would be a major pain to rip out the motherboard and replace it with the old one just to confirm the bad DIMM (especially before selling on eBay). I'm also puzzled how I could have functioned at all with just 8 GB of RAM, as in Unreal I was loading 41 million polys into a scene, just so, while maintaining 70-90 fps. I realize that's mainly hitting GPU memory, but especially while building the project the load on system memory is significant. I can't explain to myself how this ever worked with a bad DIMM.
Anyway, I've yet to turn on overclocking; I will run again and update the spreadsheet.
I agree, logging any drive not accessed by RC is wasted pixels. Final question: I see PeakUsage changed from 1202 to 1. What does PeakUsage refer to?
Thanks for your efforts, very happy to see what time it is on my machine.
-
Hi Benjy
Nice new system.. I am not enjoying the wait :D
I believe the memory report is correct, as it is taken from an operating-system interrogation command. Unless you had a weird server-type motherboard with a specialist layout, I would trust the result. It is reported in bytes, so divide by 1,073,741,824 to get GB (this is now done automatically at upload in the latest version).
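As a quick sanity check - a sketch, just plugging in the TotalPhysicalMemory figure from your earlier report (PowerShell's 1GB constant is 1,073,741,824):
@echo off
rem Sketch: convert the TotalPhysicalMemory value (bytes) to GB.
rem 8448188416 is the figure from the results posted earlier in the thread.
powershell -NoProfile -Command "[math]::Round(8448188416 / 1GB, 2)"
rem Prints 7.87, i.e. roughly one 8 GB DIMM being seen.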
The virtual memory peak usage is the highest amount of virtual memory used since that boot.
In theory it would be possible to log virtual memory usage at timestamps while the benchmark is running, then extract the highest value from those logs and use that instead, however that would be a project in itself for me, so I won't :)
It's great to see a nice improvement in calculation times. Congratulations, a worthwhile investment. :)
-
Hello Ivan,
I don't understand how my PC's previous build could have functioned as well as it did, limping along with just one of the two 8 GB DIMMs engaged, but that does appear to be the case. I'm now recalling several instances where RC wasn't delivering, e.g. missing parts, and seeing a message flash up on shutdown to the effect of "Memory location at 0000000000 is missing". I ran a memory test when I saw this and it checked out, so I dismissed it as a mystery. I was concerned about either selling bad DIMMs or not being able to sell perfectly good ones, yet I didn't want to rip out the new ITX motherboard and put the old one back just to test the DIMMs, not only to avoid the hassle, but also to avoid reversing the transfer of the Windows 10 license. I then thought to build a system around an older ATX motherboard compatible with DDR3, and the memory checked out fine. That pointed to the old ITX motherboard. A friend said he had seen errors like the ones I described and suggested the fix was updating the BIOS, the simplest remedy being to reset CMOS.
I'm left wondering what effect this limping memory issue had on my previous performance numbers. Clearly, the peak-usage difference shows time lost to hitting virtual memory, but how much? Is that mainly in simplify and texturing? I essentially quadrupled the memory, and those two operations showed roughly a doubling of performance in texturing and a 270% increase in simplify, so I'm tempted to extrapolate that, had all 16 GB been functioning normally, the increase in those two areas would have been more like 150% in texturing and maybe 235% in simplify.
I question how much time others, not to discount myself, have to tease out the true performance characteristics of each system component per RC operation, not to forget how everything performs in combination. The path of least resistance is to look at an overall performance increase, which RC already logs, and be happy AFTER making a purchase, or not. But that's precisely what motivates your benchmark project: to pin down which components play nicely with others to comprise either the optimum build, or at least the optimum build for a particular budget. To that end we'll surely need real analytics applied to a statistically significant dataset.
Another complicating factor, something exposed by running benchmark.bat with identical variables numerous times, is the variability in performance that can only be attributed to RC. I've updated the spreadsheet (reposting the link); note the variance between the five iterations. We see a 20-32% difference in performance! Is this due to what Götz talks about, a certain randomness in the behavior of the algorithms? Regardless, my choice to run 5 iterations may or may not expose the true range of variability, but it would seem to be a bare minimum to have much faith in the numbers, unless you're good with a +/- 15% margin of error.
I'm about to turn on overclocking and see what time it is, having managed to squeeze water cooling above the CPU in that tiny ITX.
I hope others will find time to test and post results, so that collectively we move toward testing a standard set of images and enabling automatic upload to a central database as discussed.
Cheers,
Benjy
-
test
-
I just switched on a bit of overclocking, CPU ratio 45x; the system was stable and the max CPU speed was supposedly bumped from 3.6 to 4.0 GHz, but my performance with benchmark.bat dropped 4%. Does anybody have experience with OC? I thought even the simplest settings would return an improvement, but no.
Addendum -
I ran several more iterations, graduating the CPU ratio; the numbers seemed to climb a bit from the 40x starting point through 45x and 50x, peaking at 55x, then dropping again at 60x, though the system never crashed. Even so, we're still down in the noise in terms of overall time saved: 102% versus the mean of the 5 iterations before OC. Keep in mind that, whatever OC brings to the table with each change, we still have the +/- 15% fluctuation within RC using the same settings, so if anything, the one effect I'm tempted to take away from the second batch of iterations is that the variability between runs is far less. The greatest performance boost I saw was with some settings sent to me by tech support at ASRock, using a CPU ratio of 44x alongside some other items, but still just 103% faster than the mean of the 5 runs without OC. I kept an eye on CPU-Z and saw the max CPU speed right at 4.4 GHz per core/thread, well above the 3.6 GHz the i7-7820X is rated at, but for such little return. Interesting that the depth maps took longer than without OC.
Between the variability within RC with fixed settings and the fact that OC is nontrivial to dial in per machine and per OPERATION(S), I think it's time to back off until more happens with the benchmark project and there are hopefully more participants against whom to compare my stats, to put more dots together.
Here, once more, is a link to my stats: https://docs.google.com/spreadsheets/d/1YJ41Rrymsh3rVBESWREaUkW01q8Ekaqoe-HN-npH-zg/edit?usp=sharing
-
Hello Ivan (or anybody),
As you can see from my previous post (last addendum), I was about to hang it up with OC, the numbers not making any sense and not seeming to add much of anything anyway. Be it due to the effect of OC on my PC or my OCD, I didn't like leaving something so messy in that state, so I returned to it once more, and I'm glad I did. I reset CMOS to get back to base level. I spoke to ASRock techs on the phone and was advised to update the BIOS to a beta version. The tech also answered a question that had been bugging me: though my i7-7820X CPU is rated at 3.6 GHz, I couldn't understand why graduating the CPU ratio upwards from the default 40x wasn't crashing my system, but also wasn't returning improved performance in your benchmark.bat. After the CMOS reset I was able to get back to baseline performance, but then saw the CPU clock speed reported at roughly 4.0 GHz. I thought: if the max CPU core speed is set to 3.6, how can this be? The ASRock tech said this was entirely normal, and that I should count myself lucky to have a chip that outperforms its rating.
I didn't only re-establish baseline performance, I actually saw a 2-3% improvement relative to the mean across 5 iterations of benchmark.bat, so that was comforting. As stated previously, the variance across the first 5 iterations following my hardware upgrades, with nothing changed between runs, was as high as a whopping +/- 15%, depending on which operation we're talking about. For instance, alignment stayed rock steady at 26 s in each of the 5 runs, but model creation varied 32% between low and high, and simplify varied 41%! After the CMOS reset and BIOS update my overall performance not only returned to the previous mean but improved a couple of percent, as I mentioned. What's also interesting is that the variation between the next 5 iterations dropped, e.g. model creation high/low was only 19% apart and simplify was down to a 9% difference.
Still perplexed by all of this, I had a thought: might the fact that my first 5 tests were run immediately after installing the new hardware itself have had an effect? If so, maybe the smoothing out of the values reflects my new system getting broken in. That thought tempted me to try turning on OC once more. This time I used Intel's Extreme Tuning Utility (it probably does the same thing as ASRock's, but provides a nice graphical GUI and live stats, which I checked against CPU-Z and Task Manager/Performance). I took the ASRock tech's advice to heart, didn't go past a 44x CPU ratio, then ran benchmark.bat twice. I watched the CPU clock speed rise to roughly 4.4 GHz and saw overall performance kick up to 189% and 190% for these two tests relative to pre-hardware-upgrade. Thermals climbed as high as 80°C, so I feel I should be happy all around that this config will perform its best with the 44x OC. We'll just have to see how stable the system is when running much larger datasets, as the thermal wall may push me back to factory settings. Maybe I should keep a bucket of liquid nitrogen handy ;^)
Anyway, glad to move on. All those other questions about an optimum build remain. I hope this project stays in motion, but if nothing else, I appreciate your providing benchmark.bat, which really was the tool for the job in-house to see what time it is.
Best,
Benjy
-
Hi Benjy
Sorry for the slow response
I will also be getting the X299 platform, and I received an email that my parts are coming into stock this weekend.
So soon I'll be back in the game.
From what I understand of the X299 platform, overclocking can be a tricky thing; power and heat can be an issue.
Lots of cores at high speed is awesome, however it does come with drawbacks.
When OC'ing on this platform, thermal and power throttling can reduce speed even if the multipliers appear to be kept up; it seems performance can be automatically reduced internally to keep things stable. The CPU's turbo features can also fluctuate depending on thermal/power limits, and this may be part of the reason for your results. BIOSes are immature etc., so in time things will likely get better. I'll post some of my experiences too.
I have not been sleeping as regards the benchmark; a few hundred lines of code have been rewritten.
A few more things to tidy up and I'll upload a new revision soon. The cache drive part had me stumped; I have spent a good few evenings just trying to figure that part out, however via the command line it is not easy. I finally have a fair workaround though, so I will be implementing that. I'm a bit of a perfectionist on this project (no one would guess it looking at the messy code), however I'm really trying to get it working correctly rather than just releasing something buggy.
I'm glad it's proved of some use so far, and it's nice to see the spreadsheet :)
Ivan
-
I tried to run the alpha benchmark - no luck but a few notes for Ivan...
I'm using Steam so I get a floating licence (much more useful for us).
To fire up RC via Steam you use "steam.exe -applaunch 489180"...
The Steam version doesn't like these commands:
-load scene.rcproj
-addFolder folder_name
-add image_name
So no times - Can I do these steps manually and extract the values from the info panels?
Thank you
Jennifer
-
You need to install the demo version.
And I don't think any dual-video-card setups have been tested yet, so you'll probably find issues.
-
OK, so I threw 687 video frames into the folder, and the demo version runs and gives me:
NSRjecross
12:46:49.52
248
20
216
12:55:24.72
NumberOfProcessors=2
Name=Intel(R) Xeon(R) CPU E5-2623 v4 @ 2.60GHz
NumberOfCores=4
NumberOfLogicalProcessors=8
Name=NVIDIA GeForce GTX 1080
TotalPhysicalMemory=137357156352
Speed=2400
Model=ATA ST2000DM001-1ER1 SCSI Disk Device
Size=2000396321280
Model=ATA SAMSUNG SSD SM87 SCSI Disk Device
Size=512105932800
Size=0
Version=10.0.15063
PeakUsage=20
ECHO is off.
Notes: I have 2 GTX 1080 (8 GB) GPUs (not SLI), and the pictures were pretty horrible. If this matters to the benchmark times, I can run some other sets, or use a standard group if one exists.
-
Many thanks Jennifer & Chris - all feedback is appreciated.
It's useful to see how it runs on dual Xeons etc., and with multiple GPUs, as I currently have only 1.
The results are not as I expected; only 3 times were recorded. Did it actually generate a completed textured model?
I assume you used version 0.2 (on this page; I should tidy up the links to avoid confusion).
The good news is my workstation arrived today, so once it's set up I'll get working on uploading a new version, which should deal with multi-GPU etc.
-
Hi Ivan - Thanks for all your work.
Yes, it seemed to generate a model, though that set of images does generate two components (it only did the model for the major component). It is the same image set I've linked in the green colorize bug report if you want to play with them.
Are we maybe hitting your "too fast" time limitation? The generated model is only about 100k polys. I've got a mineral sample that generates models with ~40 million tris that I can re-run in your benchmark script if that would be better. (https://sketchfab.com/models/c76deda5006744ba9f1a5129750b1a48)
-
Hello Ivan,
Thanks for your continued efforts. Though I'm past learning anything new by comparing performance between my previous and upgraded system specs, I saw utility in using the benchmark to isolate a particularly vexing issue I'm encountering in the Promo version, described here, wherein I'm a) able to align a set of 500 images with default settings, and b) unable to reconstruct the component, always crashing RC with the "invalid function call" red flag of death. I wanted to see how the Demo version might behave. Note, I had uninstalled Promo and reinstalled it, as well as cleared the cache and Shift-hold opened RC Promo to reset the settings, so that's about as tabula rasa as you can go, though I suspect the settings were not reset, as my assigned cache drive letter hadn't changed. This goes off topic, so back to this thread, but I thought it useful to provide context, to possibly understand the strange results I got in your benchmark.
It took me a few steps to get everything working; I'm not sure why it downloaded the Demo version over Steam this time, as I used the same installer I had saved to my downloads folder previously. Every time I launched benchmark.bat with no indication that Steam was doing anything, I'd get the txt file open with nothing moving forward, and had to hunt around to finally get Steam to load everything and ready RC Demo.
Once I got benchmark.bat to open RC Demo and step through the workflow, I kept an eye on alignment and reconstruction, especially when it came to the depth maps, as this is always where I've run into the problems leading to the invalid function call issue. Interestingly, RC Demo was able to get through the depth maps and then reconstruct all the parts, though I missed the show that followed, as it took place in the wee hours. I awoke to find RC closed; isn't that different from before? I thought I remembered seeing the finished model, but maybe not. Here's the results.txt, which shows "ECHO is off". (Why?)
I surely don't need to derail this thread with a discussion so specific to my issues, which I suspect relate to something generated by RC (a bug?) that only manifests during work in RC but operates at the system level, since reinstalling Promo and running the Demo also present problematic, though not identical, behavior. That said, I do believe sharing this with you might trip your understanding of which possible culprits belong on the table, and I'm all ears as to what you make of this news and any suggestions moving forward. I'll then move along and take what I learn back to the other thread with Kevin Cain and Götz Echtenacher.
Again, many thanks for this project.
-
Hi Benjy
Thank you for the feedback. As I change things to improve them, I inadvertently break the code somewhere else.
I know that just because it works for me is not enough, so having detailed responses like yours is helpful.
To try and answer some of your questions:
The benchmark first tries to load the Steam demo version; if that is not present, it will attempt to load the command-line version, or the command-line demo.
If things are installed in non-default directories then, for the moment, it won't work.
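Roughly speaking, the detection boils down to something like the sketch below; the install paths are my guesses at typical defaults for illustration, not lifted from the actual script:
@echo off
rem Rough sketch of the detection order described above.
rem Paths are assumed defaults, for illustration only.
set "STEAM_DEMO=C:\Program Files (x86)\Steam\steamapps\common\RealityCapture Demo\RealityCapture.exe"
set "FULL_CLI=C:\Program Files\Capturing Reality\RealityCapture\RealityCapture.exe"
if exist "%STEAM_DEMO%" (
    rem 489180 is the Steam app id mentioned earlier in the thread; Steam must be on the PATH.
    start "" steam.exe -applaunch 489180
) else if exist "%FULL_CLI%" (
    start "" "%FULL_CLI%"
) else (
    echo RealityCapture not found in the default locations.
)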
I currently only have demos, so I need to rely on others' feedback, or just guess how it may behave.
Regarding the cache drive: as I am unable to query the application as to which drive the cache is set to (although I could set the drive letter via the command line),
I added the user selection option. This does not change what the application uses; it just records the drive type selected.
I spent a long time trying to figure out how to determine the drive type from the drive letter, however I could not. Maybe using diskpart is possible, however I have purposely used only commands that interrogate, and none that could accidentally change someone's partitions etc.
Regarding the finished model at the end: I require that the application closes when the final texturing stage is done, so the code knows when to move on to the next stage and compile the records. The file is, however, saved, so you can reopen the completed project and view the model. It is currently simplified to 1 million polys as an arbitrary value.
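In batch terms, waiting for that close is essentially just a blocking start; a tiny sketch (the %RC_EXE% variable is a placeholder, not the script's real invocation):
@echo off
rem Sketch: block until RealityCapture exits before parsing the log and compiling results.
rem %RC_EXE% is a placeholder for whichever executable was detected earlier.
start /wait "" "%RC_EXE%"
echo RealityCapture has closed - safe to read the timings and build the results file now.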
Regarding the "ECHO is off" message: this happens when the tool cannot obtain the benchmark results, or fails to actually run the application to perform the benchmark. It's just my poor coding's way of saying no result was found.
I do not know when you downloaded the last benchmark tool, however yesterday I uploaded a couple of revisions to V4 (v42, v43). The first upload of V4 was reported by the Capturing Reality team to generate the same null values as you experienced: a recent Windows update slightly changed how the task manager can be queried, and I had also added a space where there should not be one. Do you know which one you downloaded? Hopefully the latest code I put up yesterday has already fixed the issue you are having.
Hopefully you have time to download the latest v43 and try again. Maybe just chuck 10 images in for a quick run to see whether it completes correctly, then you can play about with your real dataset.
As for the issues you are having that don't concern the benchmark: I have found that Windows profiles can get corrupted, and no amount of reinstalling applications helps. However, creating a new user profile gives that user a fresh registry, and this has often allowed me to get things working without a reinstall etc. It may be worth a try.
All feedback is appreciated :)
Ivan
-
Thanks, Ivan, that all makes good sense. I've linked to your .43 version, will try again. Interesting what you say about a corrupted Windows profile, will create a new one to see how that works out. Could RC be the cause of that corruption, or is this encountered in Windows generally? I've only recently moved to Windows from Mac OS.
-
I doubt RC itself would cause the issues; even Windows updates can make things go weird, or uninstallers don't fully remove something. It's worth a try for the 5 minutes it takes.
-
Alpha 0.43 can be downloaded here.
The benchmark requires CLI access, so either a demo, the Steam demo, or the full license (Promo is a no-go).
You can quickly install the Steam demo alongside the Promo without issues.
Instructions:
1) Unzip.
2) Choose your images and place them inside the newly created /CapturingRealityBenchmark/Images folder.
For now I'd suggest a smaller collection that you know works; none are currently included.
3) Run the benchmark.bat file.
You will be asked to enter an identifier/nickname at the start, to add a note if you wish, and to select the drive you use as the cache disk.
4) Sit back and relax.
5) Once the benchmark has run its course you can review the results in the results.txt file, which should pop up automatically and look like this.
-
Hello Ivan,
I tested on 0.43 and my project made it through:
So it's something about my Promo version, even after it was uninstalled and freshly installed. I tried under a new user: no go. I did notice under the new user that the preference for cache location in RC persisted from the other user, so does that imply something may still bleed through that's causing the crash? How can one be sure everything is uninstalled when there appears to be only one crack at it? Thanks.
Benjy