Hardware Optimisation & Benchmarking Shenanigans
Hi
As we all know, taking as many good pictures as possible is the best way to ensure things go well, and it ultimately saves a lot of time back in the office.
However, time is always against us. As fast as Capturing Reality is, waiting is inevitably part of the game, and it can be a long wait, days or more, only to find out your photo shoot was not good enough.
Hardware requirements are listed here: https://support.capturingreality.com/hc/en-us/articles/115001524071-OS-and-hardware-requirements, however they are very vague. From experience with rendering, video encoding etc., a badly configured system which looks great on paper can perform half as fast as a similarly priced system with carefully selected components, optimised appropriately. Throwing more money at the problem is not always the answer, and at times can slow things down.
Various stages of the calculations stress different parts of the system, however to what degree I am struggling to figure out. How can I/we optimise a system to perform at its best with the software?
I recently got rid of my dual Xeon v3 28-core workstation, which was awesome for rendering but painfully slow in RealityCapture. A much higher-clocked, newer-architecture consumer Skylake system is not hugely different in RealityCapture (yes, a little slower), yet 4x+ slower for rendering (Cinebench), while costing 5x+ less and having 4 cores instead of 28.
Below are the areas which I know can make a difference. Unfortunately, as with many things, we can't have our cake and eat it. Cost has a big influence, and technological restrictions mean you can have 32 GB of very fast RAM or 128 GB+ of slow RAM; you can have a 5.2 GHz 6-core CPU or a 3.6 GHz 16-core CPU.
- CPU clock speed in MHz (and IPC) - more is always better.
- Core count (threads) - more is better, to an extent, and not at the cost of IPC. From my experience, a dual-CPU system worked awesomely with some applications, however the system architecture did not agree with others, such as RealityCapture, and underperformed. I have a feeling that, as with GPUs, you get increasingly diminishing returns in RealityCapture when increasing core count, even if the CPUs are maxed at 100%.
- CPU instruction set support (AVX etc.) - does RealityCapture take advantage of it, or will it soon, and to what extent? I see you are looking for an AVX coder. Has AVX been enabled when the software is compiled or tested?
I personally wish to build a new system. AMD offer good-value CPUs at last with Threadripper and EPYC, however they do not support AVX well at all. It would be a disaster to invest in the wrong architecture. I am aware that AMD hardware does not perform ideally with other software due to its weak AVX; is this, or will this be, true with Capturing Reality?
- GPU count - 3 is the maximum, and as with most things you get diminishing returns.
- GPU speed / CUDA performance - 1080 Ti / Titan / Quadro etc. are the go-to cards, with the Ti being the best bang for the buck. The new Tesla V100s are compute monsters with a cost to match. Soon* we should have the consumer Volta Titans and gaming cards available.
- GPU memory - is 12 GB enough? RealityCapture frequently complains that I do not have enough video memory; maybe this is a bug, as my monitoring software says only around 1 GB of memory is in use.
- RAM amount - RealityCapture is fantastic in that, in theory, it doesn't require massive amounts like competitors, however it does have its limits. What impact does maxing out the RAM, and falling back on the swap file, have on performance?
I have hit out-of-memory errors in RealityCapture many times; is throwing more RAM at the system the best solution?
- RAM speed - 2666 MHz or 4400 MHz?
- RAM latency - ties into the above; some apps love higher clocks, others tighter timings. From my experience, optimising cache and memory performance for the CPU/RAM combination can double the speed of certain applications. Has this been tested? There sure is a lot of data being passed about.
- HDD for cache/virtual memory - latency vs. throughput. I expect this is less important, however every bit counts to an extent. I assume this becomes more valuable once RAM limits are hit.
From all the above it's easy to choose the best of everything, but you can't; you'll have to sacrifice one area to get maximum performance in another.
So, the solution:
Benchmark datasets - I searched the forum and found others asking about the availability of a benchmark, some even stating they would create one, however that was a year+ ago and nothing came of it.
Unless an integrated benchmarking tool is about to appear in the software (which would be best), I propose the following.
Have 2 different datasets available to reflect varying workloads. (I can make some, or we could utilise data provided by Capturing Reality, or maybe someone can suggest something suitable.)
a) Light dataset - fast to run.
b) Heavy dataset - takes longer, however may give more accurate results.
Users will then start the application and hit Start. Theoretically, everyone should then be on a level playing field.
Users will then be required to upload the contents of the generated logs either to the forum thread or, ideally, to a Google Form I create.
The easy part - RealityCapture.log. This is basically a duplicate of the console window and logs the timestamps for the various stages as they complete. It should be located here: c:\Users\USER\AppData\Local\Temp\
It pumps out the following as an example:
RealityCapture 1.0.2.3008 Demo RC (c) Capturing Reality s.r.o.
Using 8 CPU cores
Added 83 images
Feature detection completed in 11 seconds
Finalizing 1 component
Reconstruction completed in 31.237 seconds
Processing part 1 / 5. Estimated 1225441 vertices
Processing part 3 / 5. Estimated 38117 vertices
Processing part 4 / 5. Estimated 926526 vertices
Processing part 5 / 5. Estimated 538277 vertices
Reconstruction in Normal Detail completed in 232.061 seconds
Coloring completed in 30.105 seconds
Coloring completed in 0.116 seconds
Coloring completed in 30.363 seconds
Creating Virtual Reality completed in 294.092 seconds
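As an aside, if you only want the stage timings out of that log, a one-liner filtering on the "completed in" wording shown above should do it (path per the default log location mentioned earlier; adjust if yours differs):

rem Pull every stage-timing line out of the RealityCapture log.
findstr /C:"completed in" "%TEMP%\RealityCapture.log"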
The trickier part - system analysis. There is a nice little freeware tool called HWiNFO that does not require installation and can spit out a nice little text report, as below. It contains no sensitive info. I believe these two logs combined will contain all the information needed for us to compile a nice comparative dataset. When I say we, I mean me; I'll have to parse the data into a Google spreadsheet, which will do the calculations, and we can all see the results.
CPU: Intel Core i7-6700K (Skylake-S, R0)
4000 MHz (40.00x100.0) @ 4498 MHz (45.00x100.0)
Motherboard: ASUS MAXIMUS VIII HERO
Chipset: Intel Z170 (Skylake PCH-H)
Memory: 32768 MBytes @ 1599 MHz, 16-18-18-36
Graphics: NVIDIA GeForce GTX 1080 Ti, 11264 MB GDDR5X SDRAM
Drive: Samsung SSD 850 EVO 500GB, 488.4 GB, Serial ATA 6Gb/s @ 6Gb/s
Sound: Intel Skylake PCH-H - High Definition Audio Controller
Sound: NVIDIA GP102 - High Definition Audio Controller
Network: Intel Ethernet Connection I219-V
OS: Microsoft Windows 10 Professional (x64) Build 15063.674 (RS2)
I'll need your help :)
A) Input on my wall of text above.
B) Suggestions on the proposed benchmark & setup.
C) People to run the benchmark and post the results.
If you've read through all that and think "yeah, I'd spend 15 min running the test files and report back", please say so.
If you've read part of it and fell asleep thinking "ain't nobody got time for that", please say so too :D.
What do we get out of all this?
Eventually, when/if enough people with varying hardware post their results, we can determine where to spend our precious money to relieve the bottlenecks we hit in Capturing Reality: which components and configurations help most with, say, reconstruction or texturing, and which hardware is simply ineffective.
What say you? Do you think this is a worthwhile task, and should I proceed?
-
I agree with ivan's second to last sentence! :D
For a first try, I wouldn't overcomplicate it; let's take something that is already there.
Like ShadowTail suggested at the beginning with 36 images.
It's about working out the systematic approach.
Wow can come later.
ShadowTail, I like the object, but what about the copyright?
Also, wouldn't a round object (as in, shot in closed circles) be better for the beginning, since there are fewer problems at the (non-existent) borders?
-
I'm late to this thread; what an awesome development, count me in. I'd have to check on rights, but I recently captured in a Siberian salt mine, a highly occluded environment: not so much the mine walls, which are smooth, but the sections with human stuff, a burly mining machine, an area where miners eat lunch, lots of texture-rich tools lying about, an old dial telephone on the (psychedelic) wall. For wow factor, this place hits you in unexpected ways (surprise, complexity, natural beauty, human authenticity) while also presenting the right challenges: things, interior space, occlusions, etc.
This image comes off the internet and gives you an idea.
The source data comes from a 42 MP Sony A7Rii with a 21mm Zeiss Distagon prime; we'd simply downscale to 10 MP. I can share an animated clip out of UE4 offline (password protected) to show how some of these scenes from RC appear in deliverables.
In any event, count me in to participate in the benchmarking. I've been planning an upgrade, and I am aware how key benchmarking is, especially when tied to specific apps and the separate functions within them.
Benjy
-
You are welcome.
Don't be fooled into thinking we are experts or have the remotest clue what we are talking about or doing.
You are more than welcome to add your 2 cents. Fear not, the intention of the benchmark is exactly as you hope.
There has been a lot of talk from me, and not much evidence of my web-based results page. I have struggled immensely with that part: so many solutions which claimed to offer the ability to upload and then display data turned out to be failures. I think I have it cracked... mostly.
This is as good a time as any to share where I am. There are currently 2 parts:
1) the upload, and 2) the public results.
Getting the publicly viewable results shown in a clear, presentable manner that can be analysed and interrogated was an important part.
Here is where I am with that. The data is drawn live from the Google spreadsheet to which results are uploaded, and updates accordingly. The pie charts etc. are not final and I will change the metrics displayed/used; it's just a test to get things working, and it will eventually show more useful data for your viewing pleasure.
Note: the contents are fabricated (I edited the rawresults.txt files that were uploaded each time) and don't represent real results yet.
https://datastudio.google.com/reporting/1LVbEcggzC87TWXaKTDczwks2pRLM51b8
The uploading part is currently not as pretty (which will change) and is here:
https://script.google.com/a/ivanpascoe.com/macros/s/AKfycbwQi8gvGNy83YEhrNZykm_uLJwgUbGOdrSnauWJC1FNrLE8OpJL/exec
I'd very much appreciate anyone trying to upload some data, using the rawresults.txt generated by the benchmark. Please use that file rather than results.txt, as the latter will add garbage to the spreadsheet; I have not yet added code to reject the incorrect file. Yes, the results will be kind of useless since we are all using different datasets for now, however at the moment I need help checking that the upload process works correctly and that the results are displayed properly.
Known issues:
1) Results show instantly on the upload page, however they can take a minute or more to appear on the pretty public results page, and you will need to manually refresh for your uploaded data to appear. This is a limitation of the platform: it caches data on the server to save on resources. Poor Google and their lack of resources...
2) Works in Chrome; I do not know about other browsers.
3) Rawresults.txt must be the file selected for upload, or terrible things may happen (not that terrible, but it will make a mess of the spreadsheet with garbage data).
4) The chart is full of made-up data. For now, the fact that results can be generated, uploaded, displayed and analysed is the important part.
5) You can download the results for your own analytical pleasures; there is a hidden button next to the word "Total."
As always, feedback is really appreciated.
-
Hey ivan,
awesome post!
I say YEA :D
There have been several attempts like this, but it really needs somebody to pull all the strings together.
By looking at each stage separately, it should also be possible to figure out which HW component can improve which step of the process.
Do you know of any tool that can monitor HW usage during processing?
-
Hi Gotz, the HWiNFO tool mentioned above can also generate log files of various system resource usages over time, which can then be plotted as graphs. Going deeper than that there are profiling tools, but that starts to get too complex and is really only of use to the software's coders.
The new Windows Fall Creators Update released today now also finally monitors GPU usage in Task Manager, alongside RAM and CPU.
I have yet to test it.
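There is also a stock-Windows option with no install at all: typeperf can log performance counters to a CSV for the duration of a run. A rough sketch (counter names are the English ones and may differ on localised systems):

rem Sample total CPU load and available memory every 5 seconds, 720 samples (~1 hour).
rem The %% doubling is needed inside a .bat; use a single % at the command prompt.
typeperf "\Processor(_Total)\%% Processor Time" "\Memory\Available MBytes" -si 5 -sc 720 -o hwlog.csv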
-
Oh, that is nice. Do you know by chance if that's also true for dinosaurs on Win7?
I have HWiNFO installed and use it frequently, so that should not be a problem. I must have missed it in the depths of your über-post! :-)
Awaiting your image set! Maybe we could start with a small-ish one to iron out the kinks?
-
Something else to consider:
RC seems to have some randomness in the alignment process, which means the results can vary, the amount depending on the image set. So I guess it would make sense to run the alignment more than once, deleting the older component each time...
-
I can possibly provide a very small dataset of only 36 pictures of an object that should all align into a single component using RC's default settings.
-
Sounds like a good start!
I guess it would make sense to ask RC if they can host those images so that people can download them any time if they want to.
-
Thanks Shadow & Gotz
RC already has some demo data, accessible via the 'help' files; it contains both laser and regular camera data. https://support.capturingreality.com/hc/en-us/articles/115001484992-Dataset-samples
I wonder if this would be suitable, and I also wonder whether the CLI dataset they have available may be more suitable, and possibly even accessible to us.
-
Hmm, has anyone looked at those yet? Are they practical?
-
Sounds interesting.
I think it's important to benchmark each stage: alignment; reconstruction part 1, i.e. depth maps (GPU); part 2, creating the model (CPU); then texturing, and maybe even simplification etc.
I'm mostly interested in seeing part 2 of reconstruction, since it's by far the longest part for my scenes.
But I'm not sure how useful a few hundred photos will be; you won't see how RAM or SSDs really affect speeds until you get into 2500-5000 photos. That will take a long time to benchmark, though it would be more interesting.
This sounds like it could all be done with CLI scripts and the demo version.
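Something along these lines, going by the CLI reference (a sketch only; the command names and the placeholder path should be checked against your version's CLI help):

rem Hypothetical end-to-end run: load images, align, reconstruct, texture, then exit.
RealityCapture.exe -addFolder C:\benchmark\images -align -calculateNormalModel -calculateTexture -quit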
-
You are correct, chris: gaining data from each stage is important.
Do you think CLI scripts can work with the demo? If so, that would definitely be the way forward.
-
I'm pretty sure all the CLI commands except export work in the demo version.
I have no idea if all the right logs can be saved; I would assume so.
It would have to be done all in one go though, as you can't save any of the parts.
Maybe it's worth checking whether all the right information is saved in the logs just by hitting the Start button.
-
Hey guys,
why would it be necessary to use CLI? I think we are all motivated enough to search out the log files and copy-paste the info into a table, right? And I'm not sure what would happen if we installed the demo in parallel to the normal versions; I don't think I'd be willing to mess around with that...
Chris, you are probably right that a larger image set will give us better, or at least different, results. But I still think it would be important to try and optimise the method with a smaller project first, just so nobody gets distracted by super long processing times...
-
Hey guys,
that is a great initiative, thank you very much. We like it very much at CR, and a benchmarking tool has been on our minds for a long time. We want to support you in this initiative as much as we can.
There are several ways to do that.
One of them: in the Workflow tab in the settings there is "Progress End Notification"; read the application help for the detailed information. You can attach a bat file to the notification that will do the job you need. Somebody needs to make the scripts etc. I think that the CLI is the way to go, as it works in the demo. It can work as follows:
1. Clear the cache
2. Export and back up the global settings ("-exportGlobalSettings")
3. Import your global settings ("-importGlobalSettings"), which will also include the "Progress End Notification" hooks
4. Run the tests
I think that the cache clearing, as well as identical global settings, are the most important, so that everybody has the same starting conditions.
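A rough sketch of a wrapper .bat for those steps (only -exportGlobalSettings and -importGlobalSettings are named above; the paths, the timing lines and the exact cache-clearing command are placeholders to fill in from the application help):

rem Back up the user's current global settings, then load the benchmark's reference settings.
RealityCapture.exe -exportGlobalSettings C:\benchmark\settings_backup.xml -quit
RealityCapture.exe -importGlobalSettings C:\benchmark\benchmark_settings.xml -quit
rem (Cache clearing omitted here; look up the exact command in the CLI help.)
rem Run the timed test, then restore the user's own settings.
echo Start %time% >> results.txt
RealityCapture.exe -addFolder C:\benchmark\images -align -calculateNormalModel -calculateTexture -quit
echo End %time% >> results.txt
RealityCapture.exe -importGlobalSettings C:\benchmark\settings_backup.xml -quit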
As you mentioned already, the dataset is also very important. I would recommend a dataset of 300-500 images at ~10 Mpx resolution of some relatively complicated structure; however, as it will be exposed publicly, it should have a wow factor. We can try to find one of ours, but it can take some time. If you have something, then go ahead and use it.
-
Hey Michal,
thanks for the encouragement!
Is it possible to install the demo in parallel to a "proper" license?
I don't have CLI capabilities here on my system...
Ivan, we're waiting for you now! ;-)
-
Hi Michal
Many thanks for the supportive words.
I was unaware the Progress End Notification could do that; I shall take a good look at it and read up on the scripting required.
I agree that the dataset should have a wow factor, as that will encourage people to use it widely and have a positive impact on the overall image of the software. Something impressive is required, visually and technically.
Suggestions: what do people think would be best?
Humans (we would need someone with a multi-camera rig to donate the data); getting a really good capture appears difficult.
Architecture, internal or external. There are some very beautiful, accessible buildings about, with very nice intricate details, textures and features; my testing has had some awesome results. Ideally, a combination of nadir (overhead) and ground images makes for the best solution, which I have found to be tricky, as UAVs and local authorities don't always mix nicely :).
Statues/Monuments can work well and can look nice.
Heritage items/scenes usually have fantastic detail and textures.
Maybe something organic could be good.
The subject matter is endless..
Another option I was also thinking of is a test-scene setup (like the example from DPReview), with various objects/models placed in it.
Similar in a way to that, however fully three-dimensional and a lot more visually interesting. Interesting objects could be arranged to cover most subjects people would be interested in,
such as a lobster shell, some wood, stones and leaves, highly detailed miniature architectural models, engineering components etc., and 3D measuring guides.
If created well, the benchmark scene could not only be visually impressive and useful for performance testing; it could also be used to compare the effects of changing settings/quality/accuracy etc. in a measurable, controlled way, even between software revisions. Photographing it would have the bonus that the images could be very good and accurate thanks to 'studio' conditions. Avoiding occlusion between objects could be very tricky unless well planned out, and carefully selecting the various components would also take some thought. Perhaps I'm overcomplicating the situation?
Anyone's input and suggestions are greatly welcomed.
-
An object like the one shown in the linked video may have the required WOW factor.
https://www.youtube.com/watch?v=NWuJPENRQCU
I have tried to do a reconstruction from that video and it turned out absolutely amazing despite the relatively bad quality of the source material.
-
Relief carvings such as ShadowTail's example are indeed impressive works of art, and the software does a fantastic job of extracting the depth from them. However, I don't think they show the true ability of the software, and they could give the impression it is about 2D depth mapping rather than full 3D scene/object recreation.
The end result needs to be technologically impressive as well as visually complex and interesting. Using someone else's work is definitely a no-go from the start.
We will definitely be able to create our own data. To ensure the software is not being held up by strange issues, knowing the conditions under which the data was captured is important: full EXIF data etc.
Finding a versatile, impressive subject that captures the essence of what the software is capable of shouldn't be too hard with a bit of brainstorming. I do not believe a small dataset of, say, 36 images will be enough to stress systems realistically and gather the data we require for a benchmark, nor will it produce a result of high enough quality to represent what is possible. It is very impressive what can be done with a few images; it is even more impressive what can be done with a larger number taken carefully. Gotz's point about closed circles makes sense, and having a full scene would be nice.
With regards to overcomplicating :) it is indeed a good idea to be able to walk before you can run.
That said, I feel that if a job is worth doing, it's worth doing properly. If rushed and poorly thought out, you don't achieve what you set out to, and end up with a sub-par project that is lacking in many areas; a good balance is important. And if you bite off more than you can chew, there is the risk things never get completed. I'll have a ponder over the next few days. Keep the suggestions coming :)
-
Hey ivan,
good point about a worthwhile project.
I just wouldn't want this project to peter out because nobody gets around to shooting a 500-image project just for this.
So I would say we do both in parallel: look for a nice image set that already exists with a known result, and whoever feels like it can go out and shoot to impress! :-)
Michal, how about the stag in your showcase - would that be suitable/available?
-
I'm afraid that the stag will not be possible. I'll try to find something, and if not, then grabbing a camera and capturing any tree trunk can be a start :)
-
I think having a set of images from a UAV flying a double grid would be pretty good.
You can get a pretty decent model from 300-500 photos that way.
-
I just tried to combine WOW with ALREADY THERE. :-)
The stag would fit that, in my opinion... ...but that's not possible so moving on...
Who will be first? :D
@chris: I think the internal processing for a model like you suggested might be quite different from an "all-around" one, so I guess it would make sense to have one like that as well, especially since it is a common application for photogrammetry...
-
I would actually suggest having multiple image sets, ranging from near-perfect source images to low-quality / noisy / video-sourced ones, to show what magic RC can do even with bad source images.
Ideally it would be the same object because that way they can be compared and the differences shown.
-
That does make sense, Shadow, and it's not a bad idea at all, although it would require a good amount of strict control to ensure the variables are kept consistent between shoots. It is more of an additional test of the difference image quality makes; it could easily become quite in-depth fast, and it ties into what I was suggesting with the test scene. And yes, Chris, UAV flights can also produce great results.
Jpg vs Raw
Resolution differences
Quantity of Images
Various ISO levels
Various Focal Lengths
Sensor Sizes
Bayer vs Foveon
Optical Stabilisation on/off
...... the list goes on. These are just some of the things we know make a difference, but it would be really nice to quantify them.
I have been making progress on the technical side, starting with the least fun and trickiest bits.
I now have a script/batch file that loads up any set of images, goes through all the motions, and spits out the text below as a txt file. No other apps required: just click and go.
I think the output is *mostly* relevant. Have I missed anything obvious?
I couldn't interrogate the hardware as deeply as I wanted without external apps; however, I think if those can be avoided it will be far better.
Start time 1:07:29.43
Alignment Time 12 seconds
Reconstruction Time 48 seconds
Simplification Time 6 Seconds
Texturing Time 124 seconds
End Time 1:10:45.11
NumberOfProcessors
1
Name=Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
NumberOfCores=4
NumberOfLogicalProcessors=8
GPUName
NVIDIA GeForce GTX 1080 Ti
TotalPhysicalMemory
34272317440
Capacity PartNumber Speed
8589934592 CMK16GX4M2B3200C16 3200
8589934592 CMK16GX4M2B3200C16 3200
8589934592 CMK16GX4M2B3200C16 3200
8589934592 CMK16GX4M2B3200C16 3200
Model
Samsung SSD 850 EVO 500GB
Size 500105249280
Windows Version
10.0.16299
Pagefile Peak Usage
40
Next stage is to parse the data into an easily uploadable online *database*, which can then nicely display the results to us all.
Hopefully next week, we should have something to test out.
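For reference, stock Windows can gather that kind of hardware info with wmic queries along these lines (a sketch of the approach; field lists trimmed to match the output above):

rem Hardware inventory via built-in WMI; each query appends to the raw results file.
wmic computersystem get NumberOfProcessors >> rawresults.txt
wmic cpu get Name,NumberOfCores,NumberOfLogicalProcessors /format:list >> rawresults.txt
wmic path win32_VideoController get Name >> rawresults.txt
wmic computersystem get TotalPhysicalMemory >> rawresults.txt
wmic memorychip get Capacity,PartNumber,Speed >> rawresults.txt
wmic diskdrive get Model,Size >> rawresults.txt
wmic os get Version >> rawresults.txt
wmic pagefile get PeakUsage >> rawresults.txt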
-
Hey ivan,
great work!
Is that CLI now?
Would it be possible to split the processing up into all the individual steps? E.g. depth-map calculation has different HW needs than modelling etc...
What more info would you want? I don't think it should be too complicated either, since you can never have 100% comparable setups anyway; it depends on so much! I think what ShadowTail and you are aspiring to is its own research project! :D
It would be useful to also see how much each component has been used, but I guess that's what you mean by deep HW interrogation...
Could anyone answer my question about running the demo in parallel with the Promo?
-
Yes, it's CLI based, so end users should get a zip file with pretty much the following structure inside:
Images Folder
Benchmark.bat (contains all the script and code)
Settings.rcprog (contains variables required by the script to be used in the application/benchmark and makes no permanent changes)
Results.txt/csv (created as the bat file is run - this will need to be uploaded to the database)
CompletedBenchmarkScene.rcproj (created when benchmark finished)
I am also exploring a different method, as CLI makes things potentially tricky if using the Promo; however, being able to control all the functions is very handy indeed. It's all a work in progress.
Parsing the exported data is testing me: multiple HDDs/graphics cards/CPUs/RAM sticks can create extra lines and shift the results about, so the results structure is a little dynamic depending on the system. I need to figure that out, as well as some pathing issues.
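One approach that may help (untested here): asking wmic for key=value output instead of tables, so an extra GPU or RAM stick just adds lines rather than shifting columns:

rem /format:list emits Key=Value pairs per device, so parsing stays stable however many devices exist.
rem findstr "=" strips wmic's blank padding lines.
wmic memorychip get Capacity,PartNumber,Speed /format:list | findstr "="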
For the moment, the sub-stages within each calculation are not recorded, however I am working on that. Ultimately I think the required data can be extrapolated without them... I think... maybe...
Recording the % of CPU/RAM used as a timeline is possible, however it makes the results file huge and complex, as constant polling is needed throughout the process; whilst interesting, you just get a list of tens of thousands of numbers, and you can get a better visual sense of what is being used at certain points by playing a game of watch-the-task-manager :). We do indeed need to be wary of not undertaking a PhD in image analysis. :)
Re the Promo/Demo question: if I recall, it did not pose a problem for me when I tried last. Things may have changed, so proceed at your peril...
-
Hi ivan,
so you think it MIGHT work even with the Promo?
Maybe we could sway Michal to provide testers with a short CLI license; that would alleviate this problem.
I get what you're saying about the CPU usage. Would it be possible to thin it out by using only every, say, 100th or 1000th value and ditching the rest? But it's your call, since you do all the essential stuff. It's really great that you are putting in all that work.
I am planning on providing an image set as my contribution. I use a 12 MP camera, so that would also fit Michal's suggestion of around 10 MP. It's not high end at all, but we are not trying to create the best model ever, just a sound basis for a benchmark, right?
-
Benjy, what an interesting subject; I can indeed imagine that such an environment is quite surreal and beautiful in its imposing and harsh way. It would be great to see :)
I'd imagine those machines would reconstruct brilliantly, the dusty and dirty environment giving plenty of texture. However, such a scene would also need a lot of images to avoid misalignment bugs.
I found a lion skull (as you do), thinking that would make an exciting subject... now not so much :) I also have exactly the same capturing equipment in my arsenal, so can vouch for the results that are possible.
Off subject: one thing I have found is that the software does not support Sony raw files, so I have to pre-convert them to TIFF or similar beforehand. I did at one point manage to get the software to read them; however, I believe it was extracting the JPG preview from within the raw rather than the true raw image data.
Gotz, I don't think it will work directly with the Promo (part of the reason the Promo has a more accessible price than the full editions is that automation is disabled). However, I am pretty sure I installed the demo alongside the Promo and could either run them side by side or just uninstall the demo afterwards, with the Promo resuming fine as before. It was a few months back so I can't remember exactly, but I'm pretty sure it worked fine. For the moment I won't be adding the CPU% stuff, as parsing the data and coding are testing me enough as it is. I have got it working as a proof of concept; dealing with the outputs is another matter. So, in time, maybe.
Frustratingly, I am between systems at the moment and awaiting a new workstation, which will take a week or more to be built and arrive, so I cannot test right now.