You have to call the function estadisticas with an x and a y that are lists (or Series) of numerical values. An example would be:
import pandas as pd
import numpy as np
columnas = ['nombre', 'Matematicas', 'Ciencias', 'Espanol', 'Historia', 'EdFisica']
lineas = [['Lucia ', 7.0, 6.5, 9.2, 8.6, 8.0],
['Pedro ', 7.5, 9.4, 7.3, 7.0, 7.0],
['Ines ', 7.6, 9.2, 8.0, 8.0, 7.5],
['Luis ', 5.0, 6.5, 6.5, 7.0, 9.0],
['Andres', 6.0, 6.0, 7.8, 8.9, 7.3],
['Ana ', 7.8, 9.6, 7.7, 8.0, 6.5],
['Carlos', 6.3, 6.4, 8.2, 9.0, 7.2],
['Jose ', 7.9, 9.7, 7.5, 8.0, 6.0],
['Sonia ', 6.0, 6.0, 6.5, 5.5, 8.7],
['Maria ', 6.8, 7.2, 8.7, 9.0, 7.0]]
df = pd.DataFrame(columns=columnas, data=lineas)
df.set_index('nombre', inplace=True)
y = df['Espanol']
x = df['Matematicas']
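def estadisticas(x, y):
    # np.corrcoef and np.cov each return a 2x2 matrix;
    # the off-diagonal entry [0, 1] is the value for the pair (x, y)
    return {'Variable_1': x,
            'Variable_2': y,
            'Correlacion': np.corrcoef(x, y),
            'Covarianza': np.cov(x, y)}
print(estadisticas(x, y))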
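Since np.corrcoef and np.cov both return a 2x2 matrix, it may be handier to keep only the single number for the pair. The following is a minimal sketch of that idea; the name estadisticas_escalares is hypothetical (it is not part of the code above), and the two lists are simply the Matematicas and Espanol columns copied from the example table:

```python
import numpy as np

# Matematicas and Espanol columns copied from the example table
matematicas = [7.0, 7.5, 7.6, 5.0, 6.0, 7.8, 6.3, 7.9, 6.0, 6.8]
espanol = [9.2, 7.3, 8.0, 6.5, 7.8, 7.7, 8.2, 7.5, 6.5, 8.7]

def estadisticas_escalares(x, y):
    """Hypothetical variant: return scalars instead of 2x2 matrices."""
    # Entry [0, 1] of each matrix is the value for the pair (x, y)
    return {'Correlacion': float(np.corrcoef(x, y)[0, 1]),
            'Covarianza': float(np.cov(x, y)[0, 1])}

resultado = estadisticas_escalares(matematicas, espanol)
print(resultado)
```

The diagonal entries of those matrices describe each variable with itself (correlation 1, and the variance of each variable), so they can usually be dropped.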
To work with the dataframe directly, you can write the function so that it takes a dataframe and the names of two columns. To compare every pair of columns, itertools.combinations (https://docs.python.org/3/library/itertools.html) is a convenient way to get each combination:
import itertools
def estadisticas(df, col_x, col_y):
    return {'Variable_1': col_x,
            'Variable_2': col_y,
            'Correlacion': np.corrcoef(df[col_x], df[col_y]),
            'Covarianza': np.cov(df[col_x], df[col_y])}
for col_x, col_y in itertools.combinations(df.columns, 2):
    print(estadisticas(df, col_x, col_y))
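As an alternative to calling NumPy once per pair, pandas can compute all pairwise values in one go with DataFrame.corr() and DataFrame.cov(). This is a different technique from the function above, shown only as a sketch; df_demo is a hypothetical small table built from the first four rows of three columns of the example data:

```python
import itertools
import numpy as np
import pandas as pd

# Hypothetical demo table: first four rows of three example columns
df_demo = pd.DataFrame({'Matematicas': [7.0, 7.5, 7.6, 5.0],
                        'Ciencias': [6.5, 9.4, 9.2, 6.5],
                        'Espanol': [9.2, 7.3, 8.0, 6.5]})

corr = df_demo.corr()  # Pearson correlation matrix (the default method)
cov = df_demo.cov()    # sample covariance matrix (ddof=1, like np.cov)

# Look up each pair of columns in the two matrices
for col_x, col_y in itertools.combinations(df_demo.columns, 2):
    print(col_x, col_y, corr.loc[col_x, col_y], cov.loc[col_x, col_y])
```

Both matrices are indexed by column name, so corr.loc[col_x, col_y] gives the same number as np.corrcoef(df[col_x], df[col_y])[0, 1].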
For visualization, the most convenient option is the pairplot function (https://seaborn.pydata.org/generated/seaborn.pairplot.html) of the seaborn library:
from matplotlib import pyplot as plt
import seaborn as sns
# df = ...  (use the dataframe built above)
g = sns.pairplot(df)
plt.tight_layout()
plt.show()
You'd get: