Replacement of Do loops
Here is an example code (it calculates the so-called structure factor; please see the comment below the question) for a small data set of 2D coordinates. In reality I have up to 100 times more coordinates.

Code with AbsoluteTiming:

coordinates = Get@"https://pastebin.com/raw/wFxJ1m4U";
{lX, lZ} = {100, 100};
kX = 2 Pi/lX*(Range[lX] - lX/2) // N;
kZ = 2 Pi/lZ*(Range[lZ] - lZ/2) // N;
cx = Flatten[coordinates[[All, 1]]];
cz = Flatten[coordinates[[All, 2]]];
rX = Flatten@Outer[Differences[{##}] &, cx, cx];
rZ = Flatten@Outer[Differences[{##}] &, cz, cz];
nP = Length[coordinates];
s = Array[0 &, {lZ, lX}];
Do[
  Do[
   s[[j, i]] = 1/nP*Total[Cos[kX[[i]]*rX + kZ[[j]]*rZ]],
   {i, 1, lX}],
  {j, 1, lZ}]; // AbsoluteTiming

(* {31.3555, Null} *)

How can I speed up this code?

A huge problem occurs for many coordinates: for about 2400 coordinates the calculation takes about 3 days! Here are 2403 coordinates: https://pastebin.com/raw/0t1MBAgX . Please use {lX, lZ} = {1600, 1200} for this test. Here is the code for this large data set: https://pastebin.com/raw/MH5Pb4h0

Can the double loop be compiled?
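For concreteness, here is a minimal sketch of what I have in mind (the name cf is arbitrary; the commented-out CompilationTarget -> "C" option would need a C compiler installed):

cf = Compile[{{kX, _Real, 1}, {kZ, _Real, 1}, {rX, _Real, 1}, {rZ, _Real, 1}, {nP, _Integer}},
   (* the same double sum as above, written as a Table over both wave-vector indices *)
   Table[Total[Cos[kX[[i]]*rX + kZ[[j]]*rZ]], {j, Length[kZ]}, {i, Length[kX]}]/nP
   (* , CompilationTarget -> "C" *)];
s = cf[kX, kZ, rX, rZ, nP]; // AbsoluteTiming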
performance-tuning

asked May 8 at 8:10 by lio, edited May 14 at 17:10
– Carl Lange (May 8 at 8:17): Can you add information about what this code is supposed to do to your question?

– lio (May 8 at 8:38): @Carl Lange: It calculates the structure factor of a crystalline structure (the S(k) algorithm from a paper by Henrich et al., arXiv:1001.3342v1, equation 2.6; PDF: arxiv.org/pdf/1001.3342.pdf).

– Roman (May 8 at 8:51): On my computer your first (non-parallel) code only takes 14.4 seconds. Replacing the last double loop with s = Outer[Total[Cos[#1 + #2]] &, KroneckerProduct[kZ, rZ], KroneckerProduct[kX, rX], 1]/nP cuts this down to 8.6 seconds. I have no idea where your excessive runtime of 3714.41 seconds comes from.

– Turgon (May 8 at 8:52): I agree with @Roman, on my computer your code took about 14 s as well.

– Roman (May 8 at 9:01): Have you run your own code in a fresh kernel? What is your Mathematica version? Have you tried replacing the cosine sum by a Fourier or FourierDCT call?
1 Answer
Table works a little bit better than Do:

s = Table[Total[Cos[kX[[i]]*rX + kZ[[j]]*rZ]], {j, 1, lZ}, {i, 1, lX}]/nP;

Much more speed can be gained by first rewriting the cosines as imaginary exponentials, and then rewriting exponentials of sums as products of exponentials. As a result, rX and rZ aren't needed, and the calculation takes 0.03 seconds, or just over one second with the large data set:

s = Abs[Exp[I*KroneckerProduct[kZ, cz]].Exp[I*KroneckerProduct[cx, kX]]]^2/nP; //
  AbsoluteTiming // First
(* 1.22367 *)

A little more speed can be gained with this trick of using an internal function to calculate Abs[...]^2:

s = Internal`AbsSquare[
    Exp[I*KroneckerProduct[kZ, cz]].Exp[I*KroneckerProduct[cx, kX]]]/nP; //
  AbsoluteTiming // First
(* 1.09222 *)
A bit more speed could be gained by replacing the Kronecker products by a NestList, as the values in kX and kZ are uniformly spaced, so the many exponentials can be replaced by repeated multiplications of a single exponential. This is numerically less stable, though, and in any case at this point 50% of the time is spent in the Dot product, which is unavoidable, so the best remaining speedup you could hope for is another factor of two.
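One way that NestList idea could look (a sketch based on the definitions in the question, not a tested drop-in; the accumulated roundoff is exactly the instability mentioned above):

(* each row is the previous one times a fixed phase factor, since
   neighboring kZ (kX) values differ by 2 Pi/lZ (2 Pi/lX) *)
ekZ = NestList[#*Exp[I*2*Pi/lZ*cz] &, Exp[I*kZ[[1]]*cz], lZ - 1];            (* lZ x nP *)
ekX = Transpose[NestList[#*Exp[I*2*Pi/lX*cx] &, Exp[I*kX[[1]]*cx], lX - 1]]; (* nP x lX *)
s = Internal`AbsSquare[ekZ.ekX]/nP;   (* same result as the KroneckerProduct version, up to roundoff *)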
Lesson of the day: don't think about compiling before you've tried rewriting the algorithm.
Mathematical derivation

Starting from Eq. (2.6) of https://arxiv.org/pdf/1001.3342.pdf,
$$
S_{\vec q}=\frac{1}{N}\sum_{i j} \cos[\vec q\cdot(\vec r_i-\vec r_j)]
= \frac{1}{N}\,\mathrm{Re}\sum_{i j} e^{\text{i}\,\vec q\cdot(\vec r_i-\vec r_j)}
= \frac{1}{N}\,\mathrm{Re}\sum_{i j} e^{\text{i}\,\vec q\cdot\vec r_i}e^{-\text{i}\,\vec q\cdot\vec r_j}\\
= \frac{1}{N}\,\mathrm{Re}\left[\sum_i e^{\text{i}\,\vec q\cdot\vec r_i}\right]\left[\sum_j e^{-\text{i}\,\vec q\cdot\vec r_j}\right]
= \frac{1}{N}\left|\sum_i e^{\text{i}\,\vec q\cdot\vec r_i}\right|^2
= \frac{1}{N}\left|\sum_i e^{\text{i}\,q_x r_{i,x}}e^{\text{i}\,q_z r_{i,z}}\right|^2
$$
From there it's a matter of calculating all the possible combinations of $e^{\text{i}\,q_x r_{i,x}}$ and $e^{\text{i}\,q_z r_{i,z}}$ with all desired $\vec q$-values (in kX and kZ), and then summing over them. These operations are most efficiently done with vector operations like KroneckerProduct, listable Exp, Dot products, and finally a listable Abs or AbsSquare.
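A quick numerical sanity check of this identity (a sketch with made-up test data, assuming nothing beyond Eq. (2.6) itself): for a handful of random 2D points and one arbitrary wave vector, the double cosine sum and the squared modulus of the single exponential sum agree up to roundoff.

n = 5;
pts = RandomReal[{0, 10}, {n, 2}];   (* a few random 2D positions *)
q = {0.7, -1.3};                     (* one arbitrary wave vector *)
lhs = 1/n Sum[Cos[q.(pts[[i]] - pts[[j]])], {i, n}, {j, n}];
rhs = 1/n Abs[Total[Exp[I pts.q]]]^2;
lhs - rhs   (* essentially 0 *)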
answered May 8 at 9:18 by Roman, edited May 14 at 14:21
– lio (May 8 at 12:10): Can you check if this result is equal to my s? I named your result s2 and mine s. If I check s==s2 it gives False. Please see also (s[[#]] == s2[[#]]) & /@ Range[lZ]. Nevertheless this difference is negligible.

– Roman (May 8 at 12:14): Again, please make sure you use a fresh kernel to run calculations.

– lio (May 8 at 12:17): I did it but still get False (Mathematica 12.0).

– Roman (May 10 at 12:31): @b3m2a1 see my updated solution: if you rewrite these cosines as exponentials, then you only need to KroneckerProduct with cx instead of rX, and the result easily fits into memory. In effect, rX contains cx twice, and this double use is replaced by an Abs[...]^2 call.

– Roman (May 15 at 20:46): @cphys or simply s-s2 // Abs // Max