Welcome to How-To Document Data-Science Projects’ documentation!

How-To Document Data Science Projects with Microsoft Azure

This project accompanies a blog article published on Medium. We give you a step-by-step guide to documenting your Python data science project effectively as part of machine learning model development. The solution we propose ensures that your documentation is version controlled, available to your users or co-workers, and shipped with the source code that runs your machine learning experiment. We use generally available tools, including Sphinx, GitHub, Microsoft Azure DevOps, and Azure Storage or Azure web services.

Problem Statement

The Modified National Institute of Standards and Technology (MNIST) database is a large collection of handwritten digits that is widely used for training and testing in machine learning. The task is a classification problem with 10 classes, one per digit from 0 to 9: train a machine learning model to classify handwritten digits correctly. MNIST classification has become a popular example in introductory machine learning tutorials. An extensive list of successfully trained machine learning models can be found on Yann LeCun’s MNIST website.
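To make the 10-class framing concrete, here is a minimal NumPy sketch of how digit labels and model outputs are commonly represented for this kind of problem. It is an illustration only, not code from this project: the names `NUM_CLASSES`, `one_hot`, `predicted_digit`, and `accuracy` are hypothetical helpers introduced here.

```python
import numpy as np

NUM_CLASSES = 10  # assumption for illustration: one class per digit, 0 through 9


def one_hot(labels, num_classes=NUM_CLASSES):
    """Turn integer digit labels into one-hot rows of length num_classes."""
    labels = np.asarray(labels, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]


def predicted_digit(scores):
    """Pick the class with the highest score in each row."""
    return np.argmax(scores, axis=-1)


def accuracy(scores, labels):
    """Fraction of rows whose highest-scoring class matches the true label."""
    return float(np.mean(predicted_digit(scores) == np.asarray(labels)))
```

A classifier for this problem would produce one row of 10 scores per image; `accuracy` then compares the highest-scoring class against the true digit.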

The Data Set

The MNIST dataset was created by combining samples from NIST’s original datasets.


This example was developed using the Anaconda Python environment.

The requirements:

tensorflow
numpy
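Before running the example, it can help to confirm that the required packages are importable in the active environment. The following is a minimal sketch using only the Python standard library; the names `REQUIRED` and `missing_packages` are hypothetical and not part of this project.

```python
import importlib.util

# Assumption: the import names match the package names listed in the requirements.
REQUIRED = ("tensorflow", "numpy")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

The check only looks the packages up without importing them, so it stays fast even when a large package such as tensorflow is installed.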
