This video explains why we use the sigmoid function in neural networks, especially for binary classification. We cover the practical side: making sure we get a well-behaved gradient from the standard cross-entropy loss function, and making sure the equation is cheap to compute. We also look at the statistical side, giving an interpretation of what the logit values (the values passed into the sigmoid function) represent: they can be thought of as normally distributed values with their means shifted one way or the other depending on which class they belong to. (A short sketch of the gradient property is included at the end of this description.)

My other video, "Derivative of Sigmoid and Softmax Explained Visually":
📼 https://youtu.be/gRr2Q97XS2g

The Desmos graph of the sigmoid function:
📈 https://www.desmos.com/calculator/hjc4peyxmc

Connect with me:
🐦 Twitter - https://twitter.com/elliotwaite
📷 Instagram - https://www.instagram.com/elliotwaite
👱 Facebook - https://www.facebook.com/elliotwaite
💼 LinkedIn - https://www.linkedin.com/in/elliotwaite

Join our Discord community:
💬 https://discord.gg/cdQhRgw

🎵 Kazukii - Return
→ https://soundcloud.com/ohthatkazuki
→ https://open.spotify.com/artist/5d07MpiIaNmmEMTq79KAga
→ https://www.youtube.com/user/OfficialKazuki
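
A minimal NumPy sketch (my own illustration, not code from the video) of the gradient property mentioned above: with a sigmoid output and cross-entropy loss, the gradient with respect to the logit simplifies to sigmoid(z) - y, so it stays well-scaled even when the sigmoid saturates, and the loss can be computed directly from the logit in a numerically stable form.

import numpy as np

def sigmoid(z):
    # Standard logistic sigmoid: p = 1 / (1 + exp(-z)).
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss_from_logits(z, y):
    # Binary cross-entropy -y*log(p) - (1-y)*log(1-p), rewritten in terms
    # of the logit z for numerical stability (avoids log of tiny numbers):
    # L = max(z, 0) - z*y + log(1 + exp(-|z|)).
    return np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z)))

def bce_grad_wrt_logits(z, y):
    # The derivative of the loss with respect to the logit collapses to
    # p - y, so the gradient neither vanishes nor explodes as z saturates.
    return sigmoid(z) - y

z = np.array([-4.0, 0.5, 3.0])  # example logits
y = np.array([0.0, 1.0, 1.0])   # example binary labels
print(bce_loss_from_logits(z, y))
print(bce_grad_wrt_logits(z, y))  # equals sigmoid(z) - y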