Today, with the growing importance of big data, machine learning, and data analytics, most business school students encounter logistic regression in MBA programs. As tutors to MBA students, we provide private online tutoring for binary logistic regression. Key concepts to understand include the logit model, the difference between probability and odds, the Maximum Likelihood Estimation (MLE) technique, the interpretation of the coefficients of binary logistic regression, and the assumptions that drive binary logistic regression.

Logistic regression is usually shorthand for binary logistic regression. But what is binary logistic regression? Well, let us look at what the name tells us.

Regression: As the name indicates, binary logistic regression is a type of regression. You know by now that regression is a statistical process used to define or quantify the relationship between a dependent variable and one or more independent variables. Since we have regression in the name, binary logistic regression must do the same: it defines or quantifies the relationship between a dependent variable and one or more independent variables.

Logistic: The word logistic is derived from the word logit. The word logit was Joseph Berkson’s invention, an abbreviation for logistic unit. Logit in statistics simply means log-odds. Odds are closely related to, but not the same as, probability (more on this below). Log-odds is simply the logarithm of the odds. If probability is denoted by p, the odds will be p/(1-p) and therefore the logarithm of the odds will be log(p/(1-p)).

Binary: This process is called binary logistic regression because the dependent variable in our regression model is binary. Examples include pass/fail, accept/reject, approve/deny, yes/no, etc.

Often, understanding the name gives you a better understanding of the concept. Would you agree? We will look at the assumptions underlying binary logistic regression and walk you through a few examples. This should be sufficient for you to get a good handle on binary logistic regression!

Difference Between Probability and Odds

To understand and use logistic regression, you need to understand the difference between probability and odds. Probability and odds are interrelated but not the same.

  • The probability of an event is the number of times that event occurred in a data set divided by the total number of observations in the data set.
  • The odds of an event are the probability of that event occurring divided by the probability of that event not occurring.
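
To make the distinction concrete, here is a minimal Python sketch (the pass counts are made up purely for illustration):

```python
# Illustrative numbers only: 20 passes out of 80 students.
import math

events = 20    # times the event (pass) occurred
total = 80     # total observations in the data set

p = events / total           # probability = 0.25
odds = p / (1 - p)           # odds = 0.25 / 0.75 ≈ 0.33
log_odds = math.log(odds)    # logit = log(p / (1 - p)) ≈ -1.10

print(p, odds, log_odds)
```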

Binary Logistic Regression vs. Classification Trees

Binary logistic regression is a classification technique, like classification trees. We are often asked whether binary logistic regression is better than classification trees. Can you compare an apple and an orange? Each has its own uses and settings. We suggest you try both binary logistic regression and classification trees on your training data set and select whichever works best for you.

Maximum Likelihood Estimation (MLE) Technique

We use the maximum likelihood estimation (MLE) technique in binary logistic regression. The maximum likelihood estimation (MLE) technique essentially tries to maximize the likelihood of observing the given outcomes.

Here is how the maximum likelihood estimation (MLE) technique is applied in practice. We know that the logit function that drives binary logistic regression is log(p/(1-p)). We use the regression equation to compute a predicted score for the event we are interested in. From the predicted score, we compute the probability, p, as exp(predicted score)/(1+exp(predicted score)). From this probability we compute the likelihood of what actually happened: if the event occurred, the likelihood is the probability of the event happening; if the event did not occur, the likelihood is 1 minus the probability of the event happening. In the maximum likelihood estimation (MLE) technique, we choose the regression coefficients that maximize the total likelihood, which in Excel we do using Solver.
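
As a rough illustration, here is a minimal Python sketch of the likelihood calculation for one candidate set of coefficients (the data and coefficient values are invented; the same toy data is reused in the Excel walkthrough sketches below):

```python
# Likelihood of the observed outcomes for one candidate (b0, b1).
# Data and coefficients are invented for illustration only.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # one independent variable
y = np.array([0, 0, 1, 0, 1, 1])               # binary outcomes

b0, b1 = -2.0, 0.6                             # candidate intercept and coefficient
score = b0 + b1 * x                            # predicted score from the regression
p = np.exp(score) / (1 + np.exp(score))        # probability of the event

# Likelihood of what actually happened: p if the event occurred, 1 - p if not.
likelihood = np.where(y == 1, p, 1 - p)

print(likelihood.prod())   # MLE chooses b0, b1 to make this as large as possible
```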

Interpreting the Coefficients of Binary Logistic Regression

The intercept and coefficients do not make sense as is; it is their exponents that are useful. The exponent of a coefficient, exp(b1), is the multiplicative change in the odds of success for a one-unit increase in that variable, so (exp(b1)-1) gives the percentage increase in the odds. The exponent of the intercept gives you the odds of a success when all the variables considered are zero.
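
As a quick illustration with hypothetical coefficient values:

```python
# Hypothetical intercept and coefficient, for illustration only.
import math

b0 = -1.2   # intercept
b1 = 0.5    # coefficient of an independent variable

odds_at_zero = math.exp(b0)       # odds of success when the variable is 0 (≈ 0.30)
odds_ratio = math.exp(b1)         # odds multiply by ≈ 1.65 per one-unit increase
pct_increase = math.exp(b1) - 1   # ≈ 0.65, i.e., roughly a 65% increase in the odds

print(odds_at_zero, odds_ratio, pct_increase)
```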

Assumptions Of Binary Logistic Regression

  1. The dependent variable is binary and follows a binomial distribution.
  2. The observations/cases are independent.
  3. There is a linear relationship between the log-odds of the dependent variable and the independent variables (a linear relationship between the dependent variable itself and the independent variables is not required).
  4. Model errors are independent.
  5. Binary logistic regression requires a large sample (ideally each category has more than 5 data points; as a rule of thumb, at least 80% of categories should have at least 5 data points).

Binary Logistic Regression Analysis in Microsoft Excel: Step by Step

Step 1 Linear Regression: Run a linear regression using the binary variable of interest as the dependent variable.
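
If you want to follow along outside Excel, here is a rough Python equivalent of Step 1 (the data is invented purely for illustration and is reused in the sketches for the later steps):

```python
# Step 1 analogue: ordinary linear regression with a binary dependent variable.
import numpy as np

X = np.array([[1, 1.0], [1, 2.0], [1, 3.0],
              [1, 4.0], [1, 5.0], [1, 6.0]])   # column of 1s plus one predictor
y = np.array([0, 0, 1, 0, 1, 1])               # binary dependent variable

# Least-squares fit: these become the starting values for the coefficients.
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coeffs)   # [intercept, slope] ≈ [-0.2, 0.2] for this toy data
```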

Step 2 Predicted Value: Use the linear regression coefficients obtained in Step 1 to compute the predicted value for the dependent variable. Remember that the predicted value in a linear regression is a continuous variable; it is similar to a score.
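
A rough Python analogue of Step 2, using the toy data and the coefficients from the Step 1 sketch:

```python
# Step 2 analogue: compute the predicted value (a continuous score) for each row.
import numpy as np

X = np.array([[1, 1.0], [1, 2.0], [1, 3.0],
              [1, 4.0], [1, 5.0], [1, 6.0]])
coeffs = np.array([-0.2, 0.2])    # coefficients from the Step 1 sketch

predicted = X @ coeffs            # [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
print(predicted)
```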

Step 3 Probability: Use the predicted values from Step 2 to compute the probability of a success using the formula Exp(predicted value)/(1+Exp(predicted value)).
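
A rough Python analogue of Step 3, continuing with the predicted values from the Step 2 sketch:

```python
# Step 3 analogue: convert each predicted score into a probability.
import numpy as np

predicted = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])    # scores from Step 2

p = np.exp(predicted) / (1 + np.exp(predicted))          # probability of a success
print(p)   # ≈ [0.50, 0.55, 0.60, 0.65, 0.69, 0.73]
```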

Step 4 Estimate Likelihood: From the probabilities we computed in Step 3, we estimate the likelihood of what actually happened in each row. The likelihood is equal to the probability of success (from Step 3) when the data row was a success, and 1 minus the probability of success when the data row was NOT a success. In Excel we use the ‘IF’ function: if the data row has the outcome, we give it the probability computed above; if not, we give it the complement (1 minus the probability computed above). We are trying to maximize the total likelihood. In principle, Step 4 should have been enough, but the product of the likelihoods across all rows would be an exceedingly small number, so we move on to Step 5.
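
A rough Python analogue of Step 4, continuing the same toy example:

```python
# Step 4 analogue: likelihood of what actually happened in each row.
import numpy as np

y = np.array([0, 0, 1, 0, 1, 1])                         # observed outcomes
p = np.array([0.50, 0.55, 0.60, 0.65, 0.69, 0.73])       # probabilities from Step 3

# np.where plays the role of Excel's IF: p for a success, 1 - p otherwise.
likelihood = np.where(y == 1, p, 1 - p)
print(likelihood)   # ≈ [0.50, 0.45, 0.60, 0.35, 0.69, 0.73]
```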

Step 5 Calculate Log-Likelihood: We take the natural log of each data point’s likelihood: LN(Step 4) for each row.

Step 6 Sum of Log-Likelihood: We arrive at the total log-likelihood by summing up the log-likelihood for all the rows.
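
A rough Python analogue of Steps 5 and 6 together, continuing the same toy example:

```python
# Steps 5 and 6 analogue: log of each row's likelihood, then the total.
import numpy as np

likelihood = np.array([0.50, 0.45, 0.60, 0.35, 0.69, 0.73])   # from Step 4

log_likelihood = np.log(likelihood)        # Step 5: LN of each row
total = log_likelihood.sum()               # Step 6: sum over all rows
print(total)                               # ≈ -3.74 for this toy data
```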

Step 7 Maximize Likelihood: We maximize the total log-likelihood using Solver. The objective is the sum of log-likelihood we arrived at in Step 6, set to Max. The coefficients of the regression function we arrived at in Step 1 are the decision variables. Use the GRG Nonlinear method and allow the decision variables to be negative. The resulting coefficients become your logistic regression coefficients, and the resulting updated scores in Step 2 become the score of each data point for your event.
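
For readers working in Python rather than Excel, here is a minimal end-to-end sketch of Steps 1 through 7, with scipy.optimize.minimize standing in for Solver (the toy data is the same invented data used above):

```python
# Steps 1-7 in one place; scipy.optimize.minimize plays the role of Solver.
import numpy as np
from scipy.optimize import minimize

X = np.array([[1, 1.0], [1, 2.0], [1, 3.0],
              [1, 4.0], [1, 5.0], [1, 6.0]])   # intercept column + predictor
y = np.array([0, 0, 1, 0, 1, 1])

def negative_total_log_likelihood(coeffs):
    score = X @ coeffs                               # Step 2: predicted score
    p = np.exp(score) / (1 + np.exp(score))          # Step 3: probability
    likelihood = np.where(y == 1, p, 1 - p)          # Step 4: per-row likelihood
    return -np.sum(np.log(likelihood))               # Steps 5-6, negated to minimize

start, *_ = np.linalg.lstsq(X, y, rcond=None)        # Step 1: linear regression start
result = minimize(negative_total_log_likelihood, start)   # Step 7: the "Solver" step
print(result.x)   # the binary logistic regression coefficients
```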

Essentially, logistic regression is a technique for fitting a curve to data in which the dependent variable is binary. We provide one-on-one tutoring for binary logistic regression. Please let us know if we can help you understand the key concepts of binary logistic regression, which include the logit model, the difference between probability and odds, the Maximum Likelihood Estimation (MLE) technique, interpreting the coefficients of binary logistic regression, and the assumptions that drive binary logistic regression.