Most efficient method for export/import of non-traditional tensor-style dimensions of matrices/tables/lists-of-lists-of-lists(-of lists)?
$begingroup$
Hello World!
(I did it for my first answer, so it only feels proper...I'm not nostalgic, you are!)
OKAY, here we go!
Let's start with the good stuff. Here's my code, made relatively arbitrary because I cannot (yet) share the actual code. I will be happy to test methods with my real code and, in the end, provide a display of the collated methods and their efficacies, with attribution to those who contribute:
YourMainFunction[v1_?NumericQ,v2_?NumericQ,v3_?NumericQ,d1_?NumericQ,n_?IntegerQ]:=
(Export[NotebookDirectory[]<>ToString[N[v1,8]]<>"v1"<>ToString[N[v2,5]]<>"v2"<>ToString[N[v3,5]]<>"v3.wdx",
Transpose[SortBy[Transpose[Eigensystem[N[f[v1,v2,v3,d1,n]]]],Re[#[[1]]]&][[n+1;;2 n]]]]);
In essence, I run an Eigensystem calculation on an n-by-n matrix that is a function of three arbitrary numerical inputs, and I output the following format:
v1, v2, v3, eVals, eVecs
It must be mentioned that while the eVals are real, the eVecs are complex, and must be kept this way, if remotely possible.
However, I do realize I can put the parameters into the filename and store only the Eigensystem outputs in my exported files, like so:
expr=Transpose[SortBy[Transpose[Eigensystem[N[f[v1,v2,v3]]]],Re[#[[1]]]&][[n+1;;2 n]]];
Export[NotebookDirectory[]<>ToString[N[v1,8]]<>"v1"<>ToString[N[v2,5]]<>"v2"<>ToString[N[v3,5]]<>"v3.wdx",expr];
This is what led to the coding attempts above.
I will do this 10,000 times (100-by-100 combinations of v2 & v3) before I change the v1 value, or vice versa: there is one main parameter, while the others are cycled through all combinations. I do this in a variety of ways, usually with either ParallelSubmit (most efficient) or ParallelTable (best for reimporting ParallelSubmit-created data). On a dual-core machine, however, I cannot load the sheer volume of files in any feasible time, certainly not within a day, let alone during a research meeting. Loading them also puts too much stress on my hexa-core mobile workstation (though it does work, albeit slowly), and I only find comfort in my CPU load percentage when using a 16-core behemoth I keep caged at home.
Each result is stored separately, as this is best for stopping and starting mid-export. I then export the master list of the filenames of the exported files (the filename is what Export returns), but this list can easily be recreated and properly collated if need be. It does not seem advantageous to reassemble the intended set format each time when importing, nor would being required to do so aid the subsequent analysis of these datasets. I would also not like to lose accuracy due to truncated values in the filename, nor have lengthy strings for the filename. While I could investigate Parallel Computing methods using this post, I feel that has been addressed quite well already. I suspect my error lies in how I import and export and in which file formats I use, as the .wdx format I currently use is slow, but cross-system compatible.
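One way to avoid both truncated values and long filenames is to keep the parameters inside the file and let the filename carry only a running index. The following is a minimal sketch of that idea, not the code actually in use: `resultFile`, `saveResult`, `loadResult`, the `"run-"` prefix, and the sample values are all hypothetical, and it assumes a Wolfram Language version that supports the "WXF" format.

```mathematica
(* hypothetical helper: the file name carries only a zero-padded running index *)
resultFile[dir_String, idx_Integer] :=
  FileNameJoin[{dir, "run-" <> IntegerString[idx, 10, 6] <> ".wxf"}];

(* store the exact parameters next to the eigensystem, so nothing is truncated *)
saveResult[dir_String, idx_Integer, params_List, {eVals_, eVecs_}] :=
  Export[resultFile[dir, idx],
    <|"Parameters" -> params, "Eigenvalues" -> eVals, "Eigenvectors" -> eVecs|>,
    "WXF"];

loadResult[dir_String, idx_Integer] := Import[resultFile[dir, idx], "WXF"];

(* usage with made-up values *)
dir = CreateDirectory[];  (* fresh temporary directory *)
saveResult[dir, 1, {0.1234567891, 2.5, 3.75}, Eigensystem[N[{{2, I}, {-I, 2}}]]];
loadResult[dir, 1]["Parameters"]
```

The master list of filenames then becomes optional, since the parameters can be read back from each file.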
I know the following:
- .wdx is appropriate for cross-compatibility between systems.
- .mx is faster, but can only be loaded on systems with a matching $SystemWordLength.
- The most efficient method is DumpSave for exporting and Get for importing, but it doesn't work for pushing out a non-square table, nor is it fully compatible across systems or across versions of the WL & Mathematica.
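The relative speed of these formats can be measured rather than assumed. The following is a minimal sketch of a round-trip timing harness. `sample`, `dir`, and `roundTrip` are hypothetical names, the random matrix merely stands in for the real data, and the "WXF" entry assumes a Wolfram Language version that supports that format.

```mathematica
(* stand-in data: {eigenvalues, eigenvectors} of a random complex matrix *)
sample = Eigensystem[RandomComplex[{-1 - I, 1 + I}, {100, 100}]];

dir = CreateDirectory[];  (* fresh temporary directory for the test files *)

(* export and re-import `sample` in one format; report timings and file size *)
roundTrip[fmt_String] := Module[{file, tExport, tImport, back},
  file = FileNameJoin[{dir, "sample." <> ToLowerCase[fmt]}];
  tExport = First[AbsoluteTiming[Export[file, sample, fmt]]];
  {tImport, back} = AbsoluteTiming[Import[file, fmt]];
  <|"Format" -> fmt, "ExportSeconds" -> tExport, "ImportSeconds" -> tImport,
    "Bytes" -> FileByteCount[file], "RoundTripExact" -> (back === sample)|>];

Dataset[roundTrip /@ {"WDX", "MX", "WXF"}]
```

Timings of a single small file are noisy, so repeating the measurement over many files of realistic size would give a fairer comparison.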
My question, made gigantic:
What is the most efficient method (more so than the above), given my code above and the intended use below, for the export and import of tensor-style (non-traditional dimensions) matrices, or lists-of-lists-of-lists(-of-lists)? By efficient I mean fast, with low memory impact (we can clear memory every time, but I am not even aware of which function to investigate for this), and with a low core-count requirement.
In other words, what is the right format to use in this case to provide quick export and import of many (~10,000+) files containing matrices of non-traditional tensor-style dimensions, i.e., lists-of-lists-of-lists(-of-lists)?
Please note, prior to answering, that this must be a method that is cross-compatible and appropriate for arbitrary systems. By this I mean that something like a dual-core machine should be able to load the data in a feasible amount of time (assuming enough available memory, of course; if you can solve this too, you are welcome to include it in your answer!), so that a 4+ core setup can load it faster, albeit with similar memory usage.
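On the memory side, one option is to import one file at a time, reduce it to the quantity of interest, and let the full data go out of scope before the next file is read. The following is a minimal sketch under stated assumptions: `summarize`, `dir`, and the choice of summary (the smallest eigenvalue) are hypothetical, and each file is assumed to hold {eVals, eVecs} as exported above.

```mathematica
$HistoryLength = 0;  (* do not keep Out[n] history, which would pin large results in memory *)

(* hypothetical reducer: import one file, keep only its smallest eigenvalue *)
summarize[file_String] := Module[{data = Import[file]},
  Min[Re[First[data]]]];  (* `data` is released when Module exits *)

dir = Directory[];  (* assumption: the exported .wdx files live in the working directory *)
files = FileNames["*.wdx", dir];

before = MemoryInUse[];
summaries = summarize /@ files;  (* only the small summaries stay in memory *)
ClearSystemCache[];              (* drop internally cached results *)
MemoryInUse[] - before           (* net change in bytes *)
```

`MemoryInUse` and `ByteCount` are useful for checking which variables actually hold the bulk of the data.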
Here are some other, as yet unmentioned, resources which I found useful. They leave me with a lot of older methods and nothing that is "current": I am unsure whether .wdx remains a current format or has been replaced by .wxf or .wdf (which came first, the chicken or the egg?), or even whether .mx is indeed cross-compatible, barring the aforementioned $SystemWordLength. (Perhaps we can send around the same dataset and see if it stays consistent. I wonder how the CheckSum changes... don't answer that... but that would be fun, wouldn't it?)
Relevant References & Resources:
- How do I save a variable or function definition to a file?
- Save expression and load them into another notebook efficiently?
- Fast way to export large amount of data in "Table" format
- How to save all data in all variables so that loading it is fast?
If this seems like an unresearched or incomplete post, I've likely missed something glaring, so please let me know where I can provide clarifications or improvements. I wish this question and its subsequent answers to be a resource for all, one that alleviates future concerns about the ideal method of export and import of lists-of-lists-of-lists(-of-lists) of non-traditional tensor-style dimensions. Also, it is very late and I need to sleep, but I will edit this message of sleep away tomorrow morning, in hopes of providing further clarification or beginning to collate time & memory tests.
I've only just stumbled upon the above methods of compression and binary exportation, and understanding them currently escapes me. I fear my solution is there and that I may be posting this only to have it marked as a crazy, overdeveloped duplicate question, but it would be awesome if this method were revitalized here, using the newest methods that the community has determined, hint hint... maybe?
Also, this is super duper relevant, though it is not cross-compatible... but I need sleep, for real!
Thank you in advance to all who take the time to read through this with a serious eye. I hope that my lack of brevity does not get in the way of expressing my sheer passion for and enjoyment of Mathematica & (the) Wolfram Language in this question. Everyone here is awesome, and I am very excited to see what input you all have for this. Enjoy!
performance-tuning export import data data-structures
$endgroup$
asked May 15 at 5:57 by CA Trevillian
1 Answer
$begingroup$
You might want to give the HDF5 format a try. It seems to be very efficient, even faster than MX in the example below.
One only has to be careful to use the option "ComplexKeys" -> {"Re", "Im"} upon import (otherwise, each complex number is split into an association containing the real and imaginary parts, rendering the method very inefficient).
Export:
n = 1000;
A = RandomReal[{-1, 1}, {n, n}];
{λ, U} = Eigensystem[A];
Export["a.h5", {
"Eigenvalues" -> λ,
"Eigenvectors" -> U
}, "Datasets"] // AbsoluteTiming
{0.019481, "a.h5"}
Import:
μ = Import["a.h5", {"Data", "/Eigenvalues"},
"ComplexKeys" -> {"Re", "Im"}]; // AbsoluteTiming // First
V = Import["a.h5", {"Data", "/Eigenvectors"},
"ComplexKeys" -> {"Re", "Im"}]; // AbsoluteTiming // First
0.011885
0.01822
Check:
Max[Abs[λ - μ]]
Max[Abs[U - V]]
0.
0.
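Applied to the many-file workflow from the question, the two Import calls above could be wrapped in a small loader and mapped over a directory. This is only a sketch: `loadEigen`, `dir`, and the assumption that every file contains the datasets "/Eigenvalues" and "/Eigenvectors" are hypothetical.

```mathematica
(* hypothetical wrapper around the two Import calls shown above *)
loadEigen[file_String] := <|
  "File" -> FileBaseName[file],
  "Eigenvalues" -> Import[file, {"Data", "/Eigenvalues"}, "ComplexKeys" -> {"Re", "Im"}],
  "Eigenvectors" -> Import[file, {"Data", "/Eigenvectors"}, "ComplexKeys" -> {"Re", "Im"}]
|>;

dir = Directory[];               (* assumption: the .h5 files live in the working directory *)
files = FileNames["*.h5", dir];  (* full paths of every HDF5 file found there *)
results = loadEigen /@ files;    (* one association per file *)
```

On a multi-core machine, Map could be swapped for ParallelMap, at the cost of transferring the imported data back from the subkernels.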
$endgroup$
$begingroup$
Wow! I'm going to learn a new data format today; this looks awesome at first glance! And exactly who I hoped to hear from!! I'll make my first run of timing tests and hope to add them this evening. I always find your posts immensely helpful, and working with associations is another skill I must gain. Thank you additionally for proving your method with a round trip of export and import. I presume HDF5 is not a Mathematica-exclusive format, which may make it even more universal? I'm curious as to how small the files are! This is exciting, you might have another winner already ;)
$endgroup$
– CA Trevillian
May 15 at 15:18
I am glad to hear that you find my post stimulating. And yes, as far as I know, HDF5 is meant to be a universal and efficient file format.
– Henrik Schumacher
May 15 at 15:33
+1 for HDF5, a very useful format. Used commonly by space agencies and geospatial organizations the world over :) Also very portable: h5py for Python, h5 for R, etc.
– Carl Lange
May 21 at 13:30
Admittedly I haven’t implemented this yet, as I’ll have to regenerate my data (laziest method), but I have been studying this. Please, @CarlLange and Henrik, check my understanding: I can output my eigenvectors and eigenvalues with this format, which is good. Then I would organize my datasets such that I have a group according to the parameters I am using? I.e., something like "/v1", "/v2", "/v3", ...?
– CA Trevillian
May 21 at 13:59
edited May 15 at 15:33
answered May 15 at 6:39
Henrik Schumacher