Is numpy.corrcoef() enough to find correlation?
I am currently working through Kaggle's Titanic competition and I'm trying to figure out the correlation between the Survived column and the other columns. I am using numpy.corrcoef() to compute the correlation matrix for each pair of columns, and here is what I have:
The correlation between pClass & Survived is: [[ 1. -0.33848104]
[-0.33848104 1. ]]
The correlation between Sex & Survived is: [[ 1. -0.54335138]
[-0.54335138 1. ]]
The correlation between Age & Survived is:[[ 1. -0.07065723]
[-0.07065723 1. ]]
The correlation between Fare & Survived is: [[1. 0.25730652]
[0.25730652 1. ]]
The correlation between Parent-Children & Survived is: [[1. 0.08162941]
[0.08162941 1. ]]
The correlation between Sibling-Spouse & Survived is: [[ 1. -0.0353225]
[-0.0353225 1. ]]
The correlation between Embarked & Survived is: [[ 1. -0.16767531]
[-0.16767531 1. ]]
There should be a higher correlation between Survived and [pClass, sex, Sibling-Spouse], and yet the values are really low. I'm new to this, so I understand that a simple method may not be the best way to find correlations, but at the moment this doesn't add up.
This is my full code (without the print() calls):
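import pandas as pd
import numpy as np

# Load the Kaggle Titanic training and test sets
train = pd.read_csv("https://raw.githubusercontent.com/oo92/Titanic-Kaggle/master/train.csv")
test = pd.read_csv("https://raw.githubusercontent.com/oo92/Titanic-Kaggle/master/test.csv")

survived = train['Survived']  # target: 0 = did not survive, 1 = survived
pClass = train['Pclass']  # ticket class: 1 = first class, 3 = third class
# Encode sex numerically: female -> 0, male -> 1
sex = train['Sex'].map({'female': 0, 'male': 1})
# Fill missing ages with the rounded mean age
age = train['Age'].fillna(round(float(train['Age'].mean())))
fare = train['Fare']
parch = train['Parch']  # number of parents/children aboard
sibSp = train['SibSp']  # number of siblings/spouses aboard
# Encode port of embarkation: C -> 1, Q -> 2, S -> 3 (missing values stay NaN)
embarked = train['Embarked'].map({'C': 1, 'Q': 2, 'S': 3})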
machine-learning python feature-selection numpy kaggle
edited May 14 at 13:45
Juan Esteban de la Calle
asked May 14 at 9:45
Atilla Adrianopolos
Why do you think the values should be higher?
– nairboon, May 14 at 9:56
Because there is a strong correlation between sex, class and survival. Women and rich passengers were most likely to survive.
– Atilla Adrianopolos, May 14 at 9:59
2 Answers
On a side note, I don't think correlation is the correct measure of association for you to be using, since Survived is technically a binary categorical variable.
The "correlation" measure used should depend on the types of the variables being investigated:
- continuous variable v continuous variable: use "traditional" correlation, e.g. Spearman's rank correlation or Pearson's linear correlation.
- continuous variable v categorical variable: use an ANOVA F-test / difference of means
- categorical variable v categorical variable: use Chi-square / Cramér's V
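As a rough illustration of the last two items, here is a minimal NumPy-only sketch of Cramér's V and a one-way ANOVA F statistic. The helper names `cramers_v` and `anova_f` and the toy arrays are made up for this example (they are not Titanic values), and the code assumes no missing values and at least two categories per variable.

```python
import numpy as np


def cramers_v(x, y):
    """Cramér's V between two categorical arrays (no bias correction)."""
    x_labels, x_idx = np.unique(np.asarray(x), return_inverse=True)
    y_labels, y_idx = np.unique(np.asarray(y), return_inverse=True)
    # Contingency table of observed counts
    table = np.zeros((len(x_labels), len(y_labels)))
    np.add.at(table, (x_idx, y_idx), 1)
    n = table.sum()
    # Counts expected if the two variables were independent
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    return np.sqrt(chi2 / (n * (min(table.shape) - 1)))


def anova_f(values, groups):
    """One-way ANOVA F statistic of a continuous variable across groups."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    grand_mean = values.mean()
    ss_between = 0.0
    ss_within = 0.0
    for label in labels:
        member = values[groups == label]
        ss_between += member.size * (member.mean() - grand_mean) ** 2
        ss_within += ((member - member.mean()) ** 2).sum()
    df_between = len(labels) - 1
    df_within = values.size - len(labels)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy data, invented for illustration only
toy_sex = np.array(["female"] * 4 + ["male"] * 4)
toy_survived = np.array([1, 1, 1, 0, 0, 0, 0, 1])
toy_fare = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
toy_group = np.array([0, 0, 0, 1, 1, 1])

v = cramers_v(toy_sex, toy_survived)
f = anova_f(toy_fare, toy_group)
print(v, f)
```

For two binary variables, Cramér's V equals the absolute value of the Pearson coefficient on the 0/1 codes, which is why the numbers in the question are still informative. In practice `scipy.stats.chi2_contingency` and `scipy.stats.f_oneway` cover the same ground and also return p-values; note that `chi2_contingency` applies Yates' continuity correction to 2x2 tables by default, so its statistic can differ slightly from the uncorrected one computed here.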
answered May 14 at 11:07
bradS
Here is a closely related old post.
– Esmailian, May 18 at 15:29
@bradS When you say ANOVA F-test/difference of means, do you mean dividing ANOVA F-test by difference of means?
– Atilla Adrianopolos, May 19 at 17:50
@AtillaAdrianopolos, no I mean "/" as "or". Using item 3 above as an example, use Chi-square test of independence or Cramér's V.
– bradS, May 20 at 8:09
You probably encoded women as 0 and men as 1, which is why you get a negative correlation of -0.54: Survived is 0 for No and 1 for Yes. Your calculation actually shows what you expected. The negative sign only reflects the direction, which depends on your encoding; the correlation between being a woman and surviving is 0.54.
Similarly, pClass is negatively correlated (-0.33) because the highest class (1st class) is encoded as 1 and the lowest as 3, so the direction is negative.
You could make the relationships more intuitive by creating separate indicator columns for men and women, each holding 0 or 1 depending on the sex; the correlations will then have the intuitive direction (sign). The same holds for pClass.
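The effect of the encoding can be checked on a small example. This is only a sketch: the arrays below are invented toy data, not the Titanic columns, and `pearson` is a hypothetical helper that picks the off-diagonal entry out of the 2x2 matrix returned by `numpy.corrcoef()`.

```python
import numpy as np

# Toy data, invented for illustration only
toy_sex = np.array(["female"] * 4 + ["male"] * 4)
toy_survived = np.array([1, 1, 1, 0, 0, 0, 0, 1])


def pearson(x, y):
    """Return the single Pearson coefficient from the 2x2 corrcoef matrix."""
    return np.corrcoef(x, y)[0, 1]


# Indicator columns: one for men, one for women
is_male = (toy_sex == "male").astype(float)
is_female = (toy_sex == "female").astype(float)
# Arbitrary codes that keep the same order as is_female (male -> 3, female -> 4)
coded_3_4 = np.where(toy_sex == "male", 3.0, 4.0)

r_male = pearson(is_male, toy_survived)
r_female = pearson(is_female, toy_survived)
r_3_4 = pearson(coded_3_4, toy_survived)
print(r_male, r_female, r_3_4)
```

Swapping which category gets the larger code flips the sign but leaves the magnitude unchanged. Shifting or rescaling the codes (3/4 instead of 0/1) changes nothing at all, because the Pearson coefficient is invariant under positive linear transformations of either variable.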
edited May 14 at 13:16
Stephen Rauch
answered May 14 at 10:10
nairboon
I've added my code.
– Atilla Adrianopolos, May 14 at 10:14
What if I encode male/female with 3/4 instead? They're still binary values and just might solve the problem you're raising.
– Atilla Adrianopolos, May 14 at 10:15