Most efficient method for export/import of non-traditional tensor-style dimensions of matrices/tables/lists-of-lists-of-lists(-of lists)?


$begingroup$


Hello World!



(I did it for my first answer, so it only feels proper...I'm not nostalgic, you are!)



OKAY, here we go!



Let's start off with the good stuff. Here's my code, made relatively arbitrary, as I cannot (yet) share the actual code. I will be happy to test methods with my real code and provide, in the end, a nice display of the collated methods and their efficacies, with attributions to those who contribute:



YourMainFunction[v1_?NumericQ, v2_?NumericQ, v3_?NumericQ, d1_?NumericQ, n_?IntegerQ] :=
 (Export[NotebookDirectory[] <> ToString[N[v1, 8]] <> "v1" <> ToString[N[v2, 5]] <> "v2" <> ToString[N[v3, 5]] <> "v3.wdx",
   Transpose[SortBy[Transpose[Eigensystem[N[f[v1, v2, v3, d1, n]]]], Re[#[[1]]] &][[n + 1 ;; 2 n]]]]);


On the surface, I am running an Eigensystem calculation on an n-by-n matrix produced by a function of 3 arbitrary numerical inputs, and I output the following format:



{v1, v2, v3, eVals, eVecs}


It must be mentioned that while the eVals are real, the eVecs are complex, and must be kept this way, if remotely possible.



However, I realize I can put the parameters into the filename and keep only the Eigensystem outputs in my exported files, like so:



expr = Transpose[SortBy[Transpose[Eigensystem[N[f[v1, v2, v3]]]], Re[#[[1]]] &][[n + 1 ;; 2 n]]];
Export[NotebookDirectory[] <> ToString[N[v1, 8]] <> "v1" <> ToString[N[v2, 5]] <> "v2" <> ToString[N[v3, 5]] <> "v3.wdx", expr];


This is what resulted in the coding attempts above.



I will do this 10,000 times (100-by-100 combinations of v2 & v3) before I change the v1 value, or vice versa: there is 1 main parameter, whilst the others are cycled through all combinations. I do this in a variety of ways, often either ParallelSubmit (most efficient) or ParallelTable (best for reimportation of ParallelSubmit-created data). Currently, on a dual-core machine, I am unable to load the sheer volume of files in any feasible manner that can be completed in the length of a day, let alone over a research meeting. It also puts too much stress on my hexa-core mobile workstation (though it does work, albeit slowly), and I only find comfort in my CPU load percentage when using a 16-core behemoth I keep caged at home.



Each of these is stored separately, as this is best for stopping and starting mid-export. I then export the master list of the filenames of the exported files (the filename is what Export returns), but this can easily be recreated and properly collated if need be. It does not seem advantageous to reassemble the intended set format each time when importing, nor does being required to do so aid the subsequent analysis of these datasets. I would also not like to lose accuracy due to truncated values in the filename, nor have lengthy strings for the filename. While I could investigate Parallel Computing methods using this post, I feel this has been addressed quite well already. I recognize my error is likely in how I import and export, and in what file formats I use, as the format I use, .wdx, is slow but cross-system compatible.



I know the following:



  • .wdx is appropriate for cross-compatibility between systems


  • .mx is faster, but only able to be loaded on those systems with matching $SystemWordLength


  • The most efficient method of exporting is using DumpSave and Get, but it doesn't work for pushing out a non-square table, nor is it entirely applicable for compatibility across systems or versions of the WL & Mathematica.
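
As a rough way to compare the formats in the list above on one machine, a timing sketch along these lines could be used. It is not the code from my project: the helper name roundTrip, the 200-by-200 random test matrix, and the temporary file name are all assumptions made for illustration, and .wxf is included only because it comes up further down.

```wolfram
(* Test data (assumed for illustration): eigenvalues and eigenvectors of a random real matrix, generally complex *)
expr = Eigensystem[RandomReal[{-1, 1}, {200, 200}]];

(* Hypothetical helper: time one Export followed by one Import for a given file extension *)
roundTrip[ext_String] := Module[{file, tExport, tImport, result},
  file = FileNameJoin[{$TemporaryDirectory, "roundtrip_test." <> ext}];
  tExport = First @ AbsoluteTiming[Export[file, expr]];
  {tImport, result} = AbsoluteTiming[Import[file]];
  <|"Format" -> ext, "Export" -> tExport, "Import" -> tImport,
    "Bytes" -> FileByteCount[file], "Identical" -> (result === expr)|>
  ];

(* Collate the measurements for the three formats *)
Dataset[roundTrip /@ {"wdx", "mx", "wxf"}]
```

The timings will depend on the machine, the Mathematica version, and the size of the data, so this only gives a relative picture; it says nothing about whether a file written on one system loads on another.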

My question, made gigantic:



What is the most (more than above) efficient (fast, low memory-impact[we can clear it every time, but I am not even aware of what function to investigate for this], low-core-count-requirement) method, given my above code & following intended use, for the export and import of tensor-style (non-traditional dimensions) matrices, or lists-of-lists-of-lists(-of-lists)?



In other words, what is the right format to use in this case, in order to provide for quick export and import of many (~10,000+) files containing matrices of non-traditional tensor-style dimensions, id est, lists-of-lists-of-lists(-of-lists)?



Please note, prior to answering, that this must be a method that is cross-compatible and appropriate for arbitrary systems. What I mean by this is that something like a dual core should be able to load it (assuming enough available memory, of course [if you can solve this too, you are welcome to include it in your answer!]) in a feasible amount of time, thus allowing a 4+ core setup to load it faster, albeit with similarly impacted memory usage.



Here are some other, as yet unmentioned, resources which I found useful. I am left with a lot of older methods and nothing that is "current", so I am unsure whether .wdx remains a current format not yet replaced by .wxf or .wdf (what came first, the chicken or the egg?), or even whether .mx is indeed cross-compatible [barring the aforementioned $SystemWordLength] (perhaps we can send around the same dataset and see if it stays consistent (I wonder how the CheckSum changes...don't answer that...but that would be fun, wouldn't it?...)):



Relevant References & Resources:



  • How do I save a variable or function definition to a file?

  • Save expression and load them into another notebook efficiently?

  • Fast way to export large amount of data in "Table" format

  • How to save all data in all variables so that loading it is fast?

If this seems like an unresearched post, or an incomplete one, I've likely missed something glaring, so please let me know where I can provide clarifications or improvements. I wish this question and its subsequent answers to become a resource for all, one which will alleviate future concerns as to the ideal method of export and import of lists-of-lists-of-lists(-of-lists) of non-traditional tensor-style dimensions. Also, it is very late and I need to sleep, but I will edit this message of sleep away tomorrow morning, in hopes of providing further clarification or beginning to collate time & memory tests.



I've only just stumbled upon the above methods of compression and binary exportation, and the ability to understand them currently escapes me....I fear my solution is there, and I may just be posting this only to be marked as a crazy-overdeveloped-duplicate question, but it would be awesome if this method were revitalized here, using the newest methods that have been determined by the community, hint hint...maybe?



Also this is super duper relevant, though it is not cross-compatible...but I need sleep, for real!



Thank you in advance to all who take the time to read through this with a serious eye. I hope that my lack of brevity does not get in the way of expressing my sheer passion for and enjoyment of Mathematica & (the) Wolfram Language within this question post. Everyone here is awesome, and I am very excited to see what input you all have for this. Enjoy!










$endgroup$

















      performance-tuning export import data data-structures
















      asked May 15 at 5:57









      CA Trevillian




















          1 Answer


















          12












          You might want to give the HDF5 format a try. It seems to be very efficient, even faster than MX in the example below.

          One only has to be careful to use the option "ComplexKeys" -> {"Re", "Im"} upon import (otherwise, each complex number is split into an association containing its real and imaginary parts, which makes the method very inefficient).

          Export:

          n = 1000;
          A = RandomReal[{-1, 1}, {n, n}];
          {λ, U} = Eigensystem[A];
          Export["a.h5",
            {"Eigenvalues" -> λ,
             "Eigenvectors" -> U},
            "Datasets"] // AbsoluteTiming

          {0.019481, "a.h5"}

          Import:

          μ = Import["a.h5", {"Data", "/Eigenvalues"},
            "ComplexKeys" -> {"Re", "Im"}]; // AbsoluteTiming // First
          V = Import["a.h5", {"Data", "/Eigenvectors"},
            "ComplexKeys" -> {"Re", "Im"}]; // AbsoluteTiming // First

          0.011885

          0.01822

          Check:

          Max[Abs[λ - μ]]
          Max[Abs[U - V]]

          0.

          0.
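The answer compares against MX without showing the MX side. A rough sketch for checking that claim on your own machine, and for seeing how large the files come out, could look like the following. The file name "b.mx" and the variable names for the timings are chosen here for illustration and are not from the answer; timings and sizes depend on the machine and version, so none are quoted.

```mathematica
(* Sketch only: round-trip the same data through HDF5 and MX and compare. *)
n = 1000;
A = RandomReal[{-1, 1}, {n, n}];
{λ, U} = Eigensystem[A];

(* HDF5, as in the answer: two named datasets in one file *)
h5Export = First@AbsoluteTiming[
    Export["a.h5", {"Eigenvalues" -> λ, "Eigenvectors" -> U}, "Datasets"]];
h5Import = First@AbsoluteTiming[
    Map[Import["a.h5", {"Data", #}, "ComplexKeys" -> {"Re", "Im"}] &,
      {"/Eigenvalues", "/Eigenvectors"}]];

(* MX: "b.mx" is a hypothetical file name chosen for this comparison *)
mxExport = First@AbsoluteTiming[Export["b.mx", {λ, U}, "MX"]];
mxImport = First@AbsoluteTiming[Import["b.mx", "MX"]];

(* Timings in seconds and file sizes in bytes, keyed by format *)
<|"HDF5" -> <|"Export" -> h5Export, "Import" -> h5Import,
     "Bytes" -> FileByteCount["a.h5"]|>,
  "MX" -> <|"Export" -> mxExport, "Import" -> mxImport,
     "Bytes" -> FileByteCount["b.mx"]|>|>
```

Single runs of AbsoluteTiming are noisy, so repeating each measurement a few times before drawing conclusions is advisable.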












          • Wow! I’m going to learn a new data format today; this looks awesome at first glance! And exactly who I hoped to hear from!! I’ll make my first run of timing tests and hope to add them this evening. I always find your posts immensely helpful, and working with associations is another skill I must gain. Thank you additionally for demonstrating your method with an export and import round trip. I presume HDF5 is not a Mathematica-exclusive format, which may make it even more universal? I’m curious as to how small the files are! This is exciting, you might have another winner already ;)
            – CA Trevillian
            May 15 at 15:18






          • I am glad to hear that you find my post stimulating. And yes, as far as I know, HDF5 is meant to be a universal and efficient file format.
            – Henrik Schumacher
            May 15 at 15:33






          • +1 for HDF5, a very useful format. Used commonly by space agencies and geospatial organizations the world over :) Also very portable: h5py for Python, h5 for R, etc.
            – Carl Lange
            May 21 at 13:30











          • Admittedly I haven’t implemented this yet, as I’ll have to reproduce my data (laziest method), but I have been studying it. @CarlLange and Henrik, please check my understanding: I can output my eigenvectors and eigenvalues in this format, which is good; would I then organize my datasets so that there is a group for each set of parameters I am using? I.e., something like ..."/v1", "/v2", "/v3"...?
            – CA Trevillian
            May 21 at 13:59
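The layout asked about in the last comment, one group per parameter value, might be sketched as follows. This is an illustration under assumptions, not something confirmed in the thread: the parameter sweep, the matrix construction, and the file name "sweep.h5" are hypothetical, and the sketch assumes that dataset names written as full paths (such as "/v1/Eigenvalues") are accepted by Export and create the intermediate groups. If your version does not behave that way, flat names such as "v1_Eigenvalues" serve the same purpose.

```mathematica
(* Hypothetical parameter sweep: one small random matrix per parameter v *)
params = {1, 2, 3};
results = Association@Table[
    v -> Eigensystem[RandomReal[{-1, 1}, {50, 50}] + v IdentityMatrix[50]],
    {v, params}];

(* Build rules such as "/v1/Eigenvalues" -> data.
   ASSUMPTION: path-like names are accepted and create groups "/v1", "/v2", ... *)
rules = Catenate@KeyValueMap[
    Function[{v, sys},
      {"/v" <> ToString[v] <> "/Eigenvalues" -> sys[[1]],
       "/v" <> ToString[v] <> "/Eigenvectors" -> sys[[2]]}],
    results];

Export["sweep.h5", rules, "Datasets"];

(* List what was stored, then read back a single entry *)
Import["sweep.h5", "Datasets"]
Import["sweep.h5", {"Data", "/v2/Eigenvalues"}, "ComplexKeys" -> {"Re", "Im"}]
```

Keeping every parameter set in one file means a single dataset can be read back by name without loading the rest.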




















































































          edited May 15 at 15:33

























          answered May 15 at 6:39









          Henrik Schumacher



































