In [22]:
import pandas as pd
import statsmodels.api as sm
import seaborn as sns
import matplotlib.pyplot as plt

# Create a DataFrame from the provided Excel file
df = pd.read_excel("Facebook Friends.xlsx")

# Create a subset DataFrame with non-binary numerical variables
subset_df = df[['Age', 'Photos', '# of Tags', 'Albums', 'Posts', 'Replies', 'Children', 'Likes', 'Edu', 'Events', 'Friends']]

# Calculate the correlation matrix
correlation_matrix = subset_df.corr()

# Create a heatmap of the correlation matrix
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, cmap="coolwarm", linewidths=.5)
plt.title("Correlation Heatmap")
plt.show()

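# Fit a simple (single-predictor) linear regression of Friends on each variable.
# Note: subset_df also contains 'Friends', so the final iteration regresses
# Friends on itself and yields a trivial perfect fit (R-squared = 1.000).
for column in subset_df.columns:
    X = sm.add_constant(df[column])
    y = df['Friends']  # Friends is the dependent (response) variable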
    model = sm.OLS(y, X).fit()
    print(f"Regression Analysis for {column}:")
    print(model.summary())
    print("\n")
Regression Analysis for Age:
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                Friends   R-squared:                       0.039
Model:                            OLS   Adj. R-squared:                  0.038
Method:                 Least Squares   F-statistic:                     29.27
Date:                Sun, 01 Oct 2023   Prob (F-statistic):           8.61e-08
Time:                        17:48:08   Log-Likelihood:                -5572.9
No. Observations:                 715   AIC:                         1.115e+04
Df Residuals:                     713   BIC:                         1.116e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const       1112.7647     80.137     13.886      0.000     955.431    1270.098
Age          -17.0816      3.157     -5.410      0.000     -23.280     -10.883
==============================================================================
Omnibus:                      511.614   Durbin-Watson:                   1.641
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             8730.929
Skew:                           3.036   Prob(JB):                         0.00
Kurtosis:                      19.006   Cond. No.                         92.6
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.


Regression Analysis for Photos:
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                Friends   R-squared:                       0.060
Model:                            OLS   Adj. R-squared:                  0.059
Method:                 Least Squares   F-statistic:                     45.79
Date:                Sun, 01 Oct 2023   Prob (F-statistic):           2.75e-11
Time:                        17:48:08   Log-Likelihood:                -5565.0
No. Observations:                 715   AIC:                         1.113e+04
Df Residuals:                     713   BIC:                         1.114e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        611.5835     25.063     24.402      0.000     562.378     660.789
Photos         0.1164      0.017      6.766      0.000       0.083       0.150
==============================================================================
Omnibus:                      528.533   Durbin-Watson:                   1.639
Prob(Omnibus):                  0.000   Jarque-Bera (JB):            10056.607
Skew:                           3.139   Prob(JB):                         0.00
Kurtosis:                      20.267   Cond. No.                     1.68e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.68e+03. This might indicate that there are
strong multicollinearity or other numerical problems.


Regression Analysis for # of Tags:
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                Friends   R-squared:                       0.057
Model:                            OLS   Adj. R-squared:                  0.055
Method:                 Least Squares   F-statistic:                     42.87
Date:                Sun, 01 Oct 2023   Prob (F-statistic):           1.12e-10
Time:                        17:48:08   Log-Likelihood:                -5566.4
No. Observations:                 715   AIC:                         1.114e+04
Df Residuals:                     713   BIC:                         1.115e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        605.1193     25.825     23.432      0.000     554.418     655.821
# of Tags      0.1981      0.030      6.547      0.000       0.139       0.257
==============================================================================
Omnibus:                      519.930   Durbin-Watson:                   1.636
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             9709.290
Skew:                           3.071   Prob(JB):                         0.00
Kurtosis:                      19.976   Cond. No.                     1.01e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.01e+03. This might indicate that there are
strong multicollinearity or other numerical problems.


Regression Analysis for Albums:
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                Friends   R-squared:                       0.052
Model:                            OLS   Adj. R-squared:                  0.051
Method:                 Least Squares   F-statistic:                     39.25
Date:                Sun, 01 Oct 2023   Prob (F-statistic):           6.45e-10
Time:                        17:48:08   Log-Likelihood:                -5568.1
No. Observations:                 715   AIC:                         1.114e+04
Df Residuals:                     713   BIC:                         1.115e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        580.5135     28.567     20.321      0.000     524.428     636.599
Albums         6.0857      0.971      6.265      0.000       4.179       7.993
==============================================================================
Omnibus:                      525.737   Durbin-Watson:                   1.672
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             9573.127
Skew:                           3.134   Prob(JB):                         0.00
Kurtosis:                      19.795   Cond. No.                         38.5
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.


Regression Analysis for Posts:
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                Friends   R-squared:                       0.002
Model:                            OLS   Adj. R-squared:                  0.001
Method:                 Least Squares   F-statistic:                     1.723
Date:                Sun, 01 Oct 2023   Prob (F-statistic):              0.190
Time:                        17:48:08   Log-Likelihood:                -5586.4
No. Observations:                 715   AIC:                         1.118e+04
Df Residuals:                     713   BIC:                         1.119e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        683.6227     24.270     28.168      0.000     635.974     731.272
Posts          0.3255      0.248      1.313      0.190      -0.161       0.812
==============================================================================
Omnibus:                      494.645   Durbin-Watson:                   1.621
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             7580.646
Skew:                           2.935   Prob(JB):                         0.00
Kurtosis:                      17.832   Cond. No.                         106.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.


Regression Analysis for Replies:
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                Friends   R-squared:                       0.001
Model:                            OLS   Adj. R-squared:                 -0.000
Method:                 Least Squares   F-statistic:                    0.6610
Date:                Sun, 01 Oct 2023   Prob (F-statistic):              0.416
Time:                        17:48:08   Log-Likelihood:                -5587.0
No. Observations:                 715   AIC:                         1.118e+04
Df Residuals:                     713   BIC:                         1.119e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        688.2570     24.295     28.329      0.000     640.559     735.955
Replies        0.2182      0.268      0.813      0.416      -0.309       0.745
==============================================================================
Omnibus:                      494.853   Durbin-Watson:                   1.620
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             7609.023
Skew:                           2.935   Prob(JB):                         0.00
Kurtosis:                      17.864   Cond. No.                         98.1
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.


Regression Analysis for Children:
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                Friends   R-squared:                       0.027
Model:                            OLS   Adj. R-squared:                  0.025
Method:                 Least Squares   F-statistic:                     19.62
Date:                Sun, 01 Oct 2023   Prob (F-statistic):           1.09e-05
Time:                        17:48:08   Log-Likelihood:                -5577.6
No. Observations:                 715   AIC:                         1.116e+04
Df Residuals:                     713   BIC:                         1.117e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        727.6564     23.270     31.271      0.000     681.971     773.341
Children    -150.5914     33.995     -4.430      0.000    -217.334     -83.849
==============================================================================
Omnibus:                      500.357   Durbin-Watson:                   1.648
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             7956.183
Skew:                           2.969   Prob(JB):                         0.00
Kurtosis:                      18.225   Cond. No.                         1.65
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.


Regression Analysis for Likes:
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                Friends   R-squared:                       0.054
Model:                            OLS   Adj. R-squared:                  0.052
Method:                 Least Squares   F-statistic:                     40.56
Date:                Sun, 01 Oct 2023   Prob (F-statistic):           3.42e-10
Time:                        17:48:08   Log-Likelihood:                -5567.5
No. Observations:                 715   AIC:                         1.114e+04
Df Residuals:                     713   BIC:                         1.115e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        618.8046     24.954     24.798      0.000     569.813     667.796
Likes          0.5321      0.084      6.369      0.000       0.368       0.696
==============================================================================
Omnibus:                      456.035   Durbin-Watson:                   1.635
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             5831.785
Skew:                           2.685   Prob(JB):                         0.00
Kurtosis:                      15.920   Cond. No.                         342.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.


Regression Analysis for Edu:
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                Friends   R-squared:                       0.002
Model:                            OLS   Adj. R-squared:                  0.000
Method:                 Least Squares   F-statistic:                     1.351
Date:                Sun, 01 Oct 2023   Prob (F-statistic):              0.246
Time:                        17:48:08   Log-Likelihood:                -5586.6
No. Observations:                 715   AIC:                         1.118e+04
Df Residuals:                     713   BIC:                         1.119e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        711.5821     26.184     27.176      0.000     660.175     762.989
Edu          -58.8805     50.661     -1.162      0.246    -158.343      40.582
==============================================================================
Omnibus:                      497.249   Durbin-Watson:                   1.628
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             7766.707
Skew:                           2.949   Prob(JB):                         0.00
Kurtosis:                      18.030   Cond. No.                         2.46
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.


Regression Analysis for Events:
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                Friends   R-squared:                       0.004
Model:                            OLS   Adj. R-squared:                  0.003
Method:                 Least Squares   F-statistic:                     3.057
Date:                Sun, 01 Oct 2023   Prob (F-statistic):             0.0808
Time:                        17:48:08   Log-Likelihood:                -5585.8
No. Observations:                 715   AIC:                         1.118e+04
Df Residuals:                     713   BIC:                         1.118e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        681.4055     23.865     28.552      0.000     634.551     728.260
Events         1.6319      0.933      1.748      0.081      -0.201       3.465
==============================================================================
Omnibus:                      491.715   Durbin-Watson:                   1.621
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             7522.138
Skew:                           2.910   Prob(JB):                         0.00
Kurtosis:                      17.785   Cond. No.                         27.3
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.


Regression Analysis for Friends:
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                Friends   R-squared:                       1.000
Model:                            OLS   Adj. R-squared:                  1.000
Method:                 Least Squares   F-statistic:                 2.299e+33
Date:                Sun, 01 Oct 2023   Prob (F-statistic):               0.00
Time:                        17:48:08   Log-Likelihood:                 19526.
No. Observations:                 715   AIC:                        -3.905e+04
Df Residuals:                     713   BIC:                        -3.904e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const      -2.132e-14   1.92e-14     -1.113      0.266   -5.89e-14    1.63e-14
Friends        1.0000   2.09e-17   4.79e+16      0.000       1.000       1.000
==============================================================================
Omnibus:                      522.883   Durbin-Watson:                   0.689
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             9335.266
Skew:                           3.117   Prob(JB):                         0.00
Kurtosis:                      19.568   Cond. No.                     1.41e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.41e+03. This might indicate that there are
strong multicollinearity or other numerical problems.


In [23]:
import pandas as pd
import statsmodels.api as sm

# Create a DataFrame from the provided Excel file
df = pd.read_excel("Facebook Friends.xlsx")

# Define the dependent variable and independent variables
dependent_variable = df['Friends']
independent_variables = df[['Age', 'Photos', '# of Tags', 'Albums', 'Posts', 'Replies', 'Children', 'Likes', 'Edu', 'Events']]

# Add a constant (intercept) to the independent variables
independent_variables = sm.add_constant(independent_variables)

# Perform multiple linear regression (one response, several predictors)
model = sm.OLS(dependent_variable, independent_variables).fit()

# Print the regression summary
print(model.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                Friends   R-squared:                       0.145
Model:                            OLS   Adj. R-squared:                  0.133
Method:                 Least Squares   F-statistic:                     11.99
Date:                Sun, 01 Oct 2023   Prob (F-statistic):           2.90e-19
Time:                        17:51:38   Log-Likelihood:                -5531.1
No. Observations:                 715   AIC:                         1.108e+04
Df Residuals:                     704   BIC:                         1.113e+04
Df Model:                          10                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        782.3715     94.103      8.314      0.000     597.616     967.127
Age           -9.5473      3.719     -2.567      0.010     -16.849      -2.245
Photos         0.0559      0.028      2.008      0.045       0.001       0.111
# of Tags      0.1037      0.036      2.908      0.004       0.034       0.174
Albums         0.8066      1.604      0.503      0.615      -2.343       3.956
Posts          1.0858      0.617      1.759      0.079      -0.126       2.298
Replies       -1.1618      0.670     -1.735      0.083      -2.476       0.153
Children     -51.3321     39.267     -1.307      0.192    -128.426      25.762
Likes          0.4073      0.082      4.944      0.000       0.246       0.569
Edu          -53.9129     47.731     -1.130      0.259    -147.625      39.799
Events         1.0351      0.876      1.182      0.238      -0.684       2.754
==============================================================================
Omnibus:                      517.634   Durbin-Watson:                   1.687
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             9446.379
Skew:                           3.060   Prob(JB):                         0.00
Kurtosis:                      19.722   Cond. No.                     7.31e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 7.31e+03. This might indicate that there are
strong multicollinearity or other numerical problems.