Cloud_platforms

Some concepts and practices for cloud ML providers

There are a number of machine learning cloud platforms; we provide more details about a few of them below. In the next few lessons, you will learn how to use Amazon's SageMaker to deploy machine learning models, so we focus on SageMaker here. To allow for a comparison of features, we also provide detailed information about Google's ML Engine because it is the platform most similar to SageMaker.

Amazon Web Services (AWS)

Amazon Web Services (AWS) SageMaker is Amazon's cloud service that allows you to build, train, and deploy machine learning models. Some advantages to using Amazon's SageMaker service are the following:

  • Flexibility in Machine Learning Software: SageMaker is flexible enough to support any programming language or software framework for building, training, and deploying machine learning models in AWS. For details, see the three methods of modeling within SageMaker below:

    • Built-in Algorithms - There are at least fifteen built-in algorithms that are easily used within SageMaker. These include algorithms for discrete classification or quantitative analysis (linear learner or XGBoost), item recommendations (factorization machines), grouping based upon attributes (K-Means), image classification, and many others.

    • Custom Algorithms - Several programming languages and software frameworks can be used to develop custom algorithms, including PyTorch, TensorFlow, Apache MXNet, Apache Spark, and Chainer.

    • Your Own Algorithms - Regardless of the programming language or software framework, you can use your own algorithm when it isn't included within the built-in or custom algorithms above.

  • Ability to Explore and Process Data within SageMaker: SageMaker enables the use of Jupyter Notebooks to explore and process data, along with creation, training, validation, testing, and deployment of machine learning models. This notebook interface makes data exploration and documentation easier.

  • Flexibility in Modeling and Deployment: SageMaker provides a number of features and automated tools that make modeling and deployment easier. For the details on these features within SageMaker see below:

    • Automatic Model Tuning: SageMaker provides a feature that allows hyperparameter tuning to find the best version of the model for built-in and custom algorithms. For built-in algorithms SageMaker also provides evaluation metrics to evaluate the performance of your models.

    • Monitoring Models: SageMaker provides features that allow you to monitor your deployed models. Additionally, when deploying models, you can choose how much traffic to route to each deployed model (model variant). More information on routing traffic to model variants is available in the SageMaker documentation.

    • Type of Predictions: By default, SageMaker supports on-demand predictions, where each prediction request can contain one or many inputs. SageMaker also supports batch predictions, for which request data size limits are based upon S3 object size limits.
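
The traffic split between model variants mentioned above can be illustrated with a small sketch. It assumes each variant is given a relative weight (SageMaker's endpoint configuration calls this `InitialVariantWeight`) and that each variant receives a share of requests equal to its weight divided by the sum of all weights. The variant names, model names, and instance type are hypothetical placeholders, and the code only builds a configuration-like list and computes the shares; it does not call AWS.

```python
# Hypothetical variants, shaped like entries in a SageMaker endpoint configuration.
# All names and the instance type are placeholders chosen for illustration.
production_variants = [
    {
        "VariantName": "variant-a",
        "ModelName": "model-a",
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 9.0,
    },
    {
        "VariantName": "variant-b",
        "ModelName": "model-b",
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1.0,
    },
]


def traffic_shares(variants):
    """Return each variant's fraction of traffic: weight / sum of all weights."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    if total <= 0:
        raise ValueError("at least one variant needs a positive weight")
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


shares = traffic_shares(production_variants)
for name, share in shares.items():
    print(f"{name}: {share:.0%} of requests")
```

With weights of 9 and 1, the first variant would receive roughly nine requests in ten, which is one way a new model could be tried on a small slice of traffic before it takes over.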

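Automatic model tuning is, at its core, a search over hyperparameter values guided by an evaluation metric. The sketch below is not SageMaker's tuner; it is a plain random search, used here as a simple stand-in, over a made-up objective function. The function `validation_score`, the hyperparameter names, and their ranges are all hypothetical and take the place of training a model and reading its metric.

```python
import random


def validation_score(learning_rate, max_depth):
    """Hypothetical stand-in for 'train a model and return its validation metric'.

    By construction, the score peaks at learning_rate=0.1 and max_depth=6.
    """
    return 1.0 - 10 * (learning_rate - 0.1) ** 2 - 0.01 * (max_depth - 6) ** 2


def random_search(n_trials, seed=0):
    """Sample hyperparameters at random and keep the best-scoring trial."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.01, 0.3),
            "max_depth": rng.randint(2, 10),
        }
        score = validation_score(**params)
        if best is None or score > best["score"]:
            best = {"params": params, "score": score}
    return best


best = random_search(n_trials=50)
print(best["params"], round(best["score"], 4))
```

A managed tuning service would replace the toy objective with real training jobs and may choose the next trial more cleverly than uniform sampling, but the loop of "propose values, train, score, keep the best" is the same idea.
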
Google Cloud Platform (GCP)

Google Cloud Platform (GCP) ML Engine is Google's cloud service that allows you to build, train, and deploy machine learning models. Below we have highlighted some of the **similarities** and **differences** between these two cloud service platforms.

  • Prediction Costs: The **primary difference** between the two is how they handle predictions. With **SageMaker** predictions, you must leave resources running to provide predictions. This lowers prediction latency, at the cost of paying for idle services if no (or few) prediction requests are made while the services are running. With **ML Engine** predictions, you have the option not to leave resources running, which reduces the cost associated with infrequent or periodic requests. This option adds latency to predictions because the resources are in an offline state until they receive a prediction request. The increased latency comes from bringing resources back online, but you only pay for the time the resources are in use. For more, see the ML Engine pricing and SageMaker pricing pages.

  • Ability to Explore and Process Data: Another **difference** between ML Engine and SageMaker is that Jupyter Notebooks are not available within ML Engine. To use Jupyter Notebooks within Google Cloud Platform (GCP), you would use Datalab. GCP separates data exploration, processing, and transformation into other services. Specifically, Google's Datalab can be used for data exploration and data processing, Dataprep can be used to explore and transform raw data into clean data for analysis and processing, and Dataflow can be used to deploy batch and streaming data processing pipelines. Note that Amazon Web Services (AWS) also has data processing and transformation pipeline services, such as AWS Glue and AWS Data Pipeline.

  • Machine Learning Software: The final **difference** is that Google's ML Engine offers less flexibility than Amazon's SageMaker in the software frameworks available for building, training, and deploying machine learning models in GCP. For details regarding the two available software frameworks for modeling within ML Engine, see below.

  • Flexibility in Modeling and Deployment: Google's ML Engine provides a number of features and automated tools that make modeling and deployment easier, similar to those provided by Amazon's SageMaker. For details on these features within ML Engine, see below:
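
Returning to the prediction-cost difference in the first bullet of this list, the trade-off can be made concrete with back-of-the-envelope arithmetic. Every number below is a made-up placeholder, not a real AWS or GCP price, and the function names are hypothetical. The point is only the shape of the comparison: an always-on endpoint bills for every hour, while an on-request service bills only for the time spent serving.

```python
HOURS_PER_MONTH = 730  # approximate average number of hours in a month


def always_on_cost(hourly_rate):
    """Monthly cost of an endpoint that runs (and bills) all month, busy or idle."""
    return hourly_rate * HOURS_PER_MONTH


def on_request_cost(hourly_rate, requests, seconds_per_request):
    """Monthly cost when billing covers only the time spent serving requests."""
    busy_hours = requests * seconds_per_request / 3600
    return hourly_rate * busy_hours


# Hypothetical inputs, chosen only to illustrate the comparison.
rate = 0.20                # dollars per hour (placeholder)
monthly_requests = 10_000  # prediction requests per month (placeholder)
seconds_each = 2.0         # serving time per request (placeholder)

always_on = always_on_cost(rate)
on_request = on_request_cost(rate, monthly_requests, seconds_each)
print(f"always-on:  ${always_on:.2f} per month")
print(f"on-request: ${on_request:.2f} per month")
```

This sketch leaves out the extra latency of bringing resources back online, which is the price paid for the lower bill. With enough traffic, the busy hours approach the full month and the gap between the two costs narrows.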
