Network Optimization and AI Inferencing Management for Telepathology Reference Implementation

Version: 3.0.0   Published: 03/02/2021

Last updated: 10/20/2021

Overview

The Network Optimization and AI Inferencing Management for Telepathology Reference Implementation enables digital pathology through lab analysis automation. It showcases how an optimized cloud-native software architecture can simplify and automate networking challenges and optimize AI model deployment and management for digital pathology within a hospital system.  

Despite the benefits of telepathology, the following challenges unique to the medical community must be resolved:

  • Efficient and accurate data management for sharing within the hospital IT infrastructure or with other hospitals and medical facilities.
  • Very large medical image files (up to 80 GB uncompressed).
  • An ecosystem extending from the Whole Slide Imaging (WSI) equipment at the edge to on-premise and cloud server platforms.
  • A multi-access network (i.e., wired, Wi-Fi, 4G/5G, etc.), where logic is needed to route the data properly and security is of the utmost importance and must not be compromised.

This Reference Implementation (RI) provides solutions to some of those unique challenges through: 

  • Automated network abstraction, which helps avoid complex data routing and traffic shaping and gives confidence in efficient data sharing and AI model utilization.   
  • Reduced ‘hands-on’ management for data routing as well as AI model optimization within the IT infrastructure.

Select Configure & Download to download the reference implementation and the software listed below.  

Configure & Download


Screenshot of running the reference implementation

Time to Complete: 60 - 90 minutes

Programming Language: Python*

Available Software:
  • OpenNESS version 21.03
  • Google Cloud SDK 318.0.0 (for the Google Cloud Storage option)
  • OpenVINO™ Model Server (OVMS) version 2021.2


Target System Requirements 

Controller and Edge Node Systems 

  • One of the following processors: 
    • Intel® Xeon® Scalable processor. 
  • At least 128 GB RAM. 
  • At least 256 GB hard drive. 
  • Intel® Ethernet Converged Network Adapter X710-DA4
  • CentOS* 7.9.2009.  
  • An Internet connection.

Client Systems

  • One of the following processors: 
    • Intel® Xeon® processor. 
  • At least 8 GB RAM. 
  • At least 256 GB hard drive. 
  • Intel® Ethernet Converged Network Adapter X710-DA4
  • Ubuntu* 18.04
  • An Internet connection. 

How It Works

Component Overview 

The Network Optimization and AI Inferencing Management for Telepathology RI includes the OpenVINO™ Model Server (OVMS) along with OpenNESS software.  

Open Network Edge Services Software (OpenNESS) Toolkit 

The Open Network Edge Services Software (OpenNESS) toolkit enables developers to port existing cloud applications to the edge. It provides components to build platform software, the ability to build and deploy end-to-end (E2E) edge services in the field, and benchmarking for diverse edge deployment scenarios (on-premise and access edge). Learn more.

OpenVINO™ Model Server (OVMS) 

OpenVINO™ Model Server (OVMS) is a scalable, high-performance solution for serving machine learning models optimized for Intel® architectures. The server provides an inference service via gRPC or REST API - making it easy to deploy new algorithms and AI experiments using the same architecture as TensorFlow* Serving for any models trained in a framework that is supported by OpenVINO™.  

The server implements gRPC and REST API framework with data serialization and deserialization using TensorFlow Serving API, and OpenVINO™ as the inference execution provider. Model repositories may reside on a locally accessible file system (for example, NFS), Google Cloud Storage* (GCS), Amazon S3*, MinIO*, or Azure Blob Storage*. Learn more.  
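
Because OVMS uses the TensorFlow Serving gRPC API, a client can be sketched with the standard tensorflow-serving-api Python bindings. The snippet below is only an illustration, not the RI's actual client: the endpoint localhost:9000, the input tensor name input, and the image shape are assumptions to adapt to your deployment; stardist-0001 is the model name used in this RI.

# Minimal gRPC inference client sketch for OVMS (illustrative; not the RI's client).
# Requires: pip install tensorflow-serving-api (pulls in tensorflow and grpcio).
import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel("localhost:9000")      # assumed OVMS gRPC address
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "stardist-0001"              # model served in this RI
image = np.zeros((1, 3, 256, 256), dtype=np.float32)   # dummy image batch
# "input" is an assumed tensor name; query the model metadata API for the real one.
request.inputs["input"].CopyFrom(tf.make_tensor_proto(image, shape=image.shape))

response = stub.Predict(request, 10.0)                 # 10-second timeout
print(list(response.outputs.keys()))                   # names of returned tensors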

Grafana 

Data visualization is done in Grafana, which lets you view the cluster monitoring dashboard. System usage metrics such as network I/O, total CPU usage, pods CPU usage, system services CPU usage, containers CPU usage, and all processors CPU usage can be viewed.

Architecture Diagram
Figure 1. Architecture Diagram

 

The Client in this example is a machine holding medical images on which image analytics needs to be performed. It continuously sends RPC calls to OVMS; OVMS performs inference on the underlying hardware and sends the result back to the client. The result is then pushed to InfluxDB and fetched by Grafana for visualization. In parallel, the generated labelled image is served by a Flask server that is integrated with Grafana. Prometheus sends pod metrics, such as memory usage and CPU usage, to Grafana. For manual execution and tinkering, follow the detailed startup steps below.
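
As an illustration of the InfluxDB leg of this pipeline, the sketch below pushes per-image inference results to an InfluxDB 1.x service with the influxdb Python client. The database, measurement, and field names are assumptions for illustration, not the RI's actual schema.

# Sketch: push per-image inference metrics to InfluxDB 1.x for Grafana to chart.
# Database/measurement/field names are illustrative, not the RI's actual schema.
# Requires: pip install influxdb
from influxdb import InfluxDBClient

client = InfluxDBClient(host="localhost", port=8086)   # assumed InfluxDB address
client.create_database("telepathology")                # no-op if it already exists
client.switch_database("telepathology")

point = {
    "measurement": "inference",
    "tags": {"model": "stardist-0001"},
    "fields": {"latency_ms": 42.7, "cells_detected": 118},
}
client.write_points([point])                           # Grafana queries this data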


Get Started 

Prerequisites 

Make sure that the following conditions are met properly to ensure a smooth installation process. 

  1. Hardware Requirements 
    Make sure you have a fresh CentOS 7.9.2009 installation with the Hardware specified in the Target System Requirements section. 
     
  2. Proxy Settings 
    If you are behind a proxy network, please ensure that proxy addresses are configured in the system. 
    export http_proxy=<proxy-address>:<proxy-port> 
    export https_proxy=<proxy-address>:<proxy-port>
  3. Date & Time  
    Make sure that the Date & Time are in sync with current local time.
     
  4. IP Address Conflict 
    Make sure that the Edge Controller IP is not conflicting with OpenNESS reserved IPs. For more details, please refer to IP address range allocation for various CNIs and interfaces in the Troubleshooting section.
     
  5. Log in as the root user.

    su - root

     

  6. For non-root user installation, create a new user named openness with the password openness:

    useradd openness 
    passwd openness  

    When prompted, enter openness as the new password and retype it to confirm. You can safely ignore the "BAD PASSWORD: The password contains the user name in some form" warning.
  7. Provide sudoers permission to the openness user.

    echo "openness ALL=(ALL) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/openness

     

  8. Generate an ssh key if one has not been created.

    ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa

     

  9. Copy the ssh key to the openness user.

    ssh-copy-id -i /root/.ssh/id_rsa.pub openness@<controller ip address> 

     

  10. Enter yes when prompted to continue connecting. Enter the user password at the password prompt.

  11. Update the hostname if it is localhost. This must be done on all target machines.

    hostnamectl set-hostname <newhostname> 

    Then update the /etc/hosts file with the new hostname:
    127.0.0.1 <newhostname>  
    ::1 <newhostname> 

     

  12. Install the following basic libraries/packages during setup if not already done: pciutils, usbutils, python-devel, python3-devel, wget, and unzip.

    yum install -y pciutils usbutils python-devel python3-devel wget unzip

     

Step 1: Install Google Cloud SDK*  

Note This step applies only if you chose the Google Cloud Storage option. If you chose the Local Storage option, skip this step.

Follow the steps below from the controller device to install the Google Cloud SDK*.  

Note You need a Google Cloud Platform account to complete the installation and utilize the RI.

1. Download the Google Cloud SDK package for Linux* using the following command: 

curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-318.0.0-linux-x86_64.tar.gz

2. Extract the downloaded package and install the Google Cloud SDK from the extracted directory using the following commands: 

tar -xf google-cloud-sdk-318.0.0-linux-x86_64.tar.gz 
./google-cloud-sdk/install.sh 
./google-cloud-sdk/bin/gcloud init 

3. Enter the account details and configure the cloud project when prompted. 

Screenshot of Google Cloud Project SDK
Figure 2. Google Cloud Project SDK

 

Note Restart the terminal after initializing the Google Cloud SDK.

Step 2: Install the Reference Implementation  

Note Before installing the Reference Implementation, make sure that the Ubuntu* 18.04 Client system is on the same network as the Server, so that they are able to communicate. 

Note On the Ubuntu 18.04 Client system, follow the steps below to be able to ssh as the root user:

  • Open the file /etc/ssh/sshd_config and find the following line: 
PermitRootLogin without-password
  • Change that line to the following (uncomment it if it is commented): 
PermitRootLogin yes 
  • Save the file, then restart the SSH server using the following command: 
sudo service ssh restart
  • Set the sudo password using the following command:  
sudo passwd 

Select Configure & Download to download the reference implementation and then follow the steps below from the Controller to install it. 

Configure & Download

1. Make sure that the Target System Requirements are met properly before proceeding further.   

  • For single-device mode, only one machine is needed. (Both the Controller and the Edge Node will be on the same device.)  
  • For multi-device mode, make sure you have at least two machines (one for the Controller and the other for the Edge Node).  

Note Multi-device mode is not currently supported for this release. 

2. Open a new terminal as the openness user and move the downloaded zip package to the /home/openness folder.  

mv <path-of-downloaded-directory>/network_optimization_and_ai_inferencing_management_for_telepathology.zip /home/openness 

3. Go to the /home/openness directory using the following commands and unzip the RI.   

cd /home/openness  
unzip network_optimization_and_ai_inferencing_management_for_telepathology.zip 

4. Go to the network_optimization_and_ai_inferencing_management_for_telepathology/ directory.  

cd network_optimization_and_ai_inferencing_management_for_telepathology 

5. Change permission of the executable edgesoftware file.  

chmod 755 edgesoftware 

 6. Run the command below to install the Reference Implementation:  

./edgesoftware install

7. During the installation, you will be prompted for the Product Key. The Product Key is contained in the email you received from Intel confirming your download.

Note Installation logs are available at path: /var/log/esb-cli/Network_Optimization_and_AI_Inferencing_Management_for_Telepathology//install.log

Screenshot of product key
Figure 3. Product Key

8. During the installation, you will be prompted to configure a few things before installing OpenNESS. Refer to the screenshot below for the configuration. 

Note Multi-Device is not currently supported. Select Single Device when prompted to Select the type of installation.

Screenshot of OpenNESS configuration
Figure 4. OpenNESS Configuration

 

Note If you are using a Microsoft Azure* instance, please enter the Private IP address as the IP address of Controller. 

9. During the installation, you will be prompted for the IP address of the controller and client. Enter the correct IP addresses.

Screenshot of IP Address of Client and Controller
Figure 5. IP Address of Client and Controller

 

10. When the installation is complete, you will see the message Installation of package complete and the installation status for each module.  

Screenshot of install success
Figure 6. Installation Status

 

11. If OpenNESS is installed, running the following command should show output similar to the image below. All pods should be in either the Running or Completed state.  

kubectl get pods -A 
Screenshot of Status of Pods
Figure 7. Status of Pods

Step 3: Copy the Model Files to Google Cloud Storage 

Note This step is needed only if you chose the Google Cloud Storage option. If you chose the Local Storage option, skip this step and follow Step 4: Copy the Model Files to Local Storage.

1. Navigate to the node directory.  

cd /home/openness/network_optimization_and_ai_inferencing_management_for_telepathology/Network_Optimization_and_AI_Inferencing_Management_for_Telepathology_1.0.0/Network_Optimization_and_AI_Inferencing_Management_for_Telepathology/TelePathology/node/

2. Create a Cloud Storage bucket using the Guide.  

3. Provide the required permissions to the storage bucket using the following steps: click the storage bucket to view the Bucket details, then click Permissions and select Add.

Screenshot of Required Permissions
Figure 8. Required Permissions

 

  • Select allUsers in the New Members section. 
  • Select Storage Legacy Bucket Owner in the Role section. 
  • Click on ADD ANOTHER ROLE.
  • Select Storage Object Viewer in the next Role section. 
  • Click on SAVE.
Screenshot of Add Roles
Figure 9. Add Roles
  • Click on ALLOW PUBLIC ACCESS when prompted.
Screenshot of Allow Public Access
Figure 10. Allow Public Access

 

4. Upload the models/ directory to the Google Cloud Storage bucket from an Internet browser. Alternatively, to upload the models/ directory from the terminal, use the following command: 

gsutil cp -r models/ gs://<your_google_cloud_storage_bucket_name>/ 

Step 4: Copy the Model Files to Local Storage 

Note This step is valid only if you chose the Local Storage option. If you chose the Google Cloud Storage option, skip this step. 

1. Navigate to the /opt directory.  

cd /opt

2. Create a models directory.  

sudo mkdir models

3. Copy the model files to the created models directory.  

sudo cp -r /home/openness/network_optimization_and_ai_inferencing_management_for_telepathology/Network_Optimization_and_AI_Inferencing_Management_for_Telepathology_1.0.0/Network_Optimization_and_AI_Inferencing_Management_for_Telepathology/TelePathology/node/models/1 /opt/models

4. Change the user and group ownership of the /opt/models folder recursively (5000 is the UID/GID used by the OVMS container).  

sudo chown -R 5000:5000 models

 

Step 5: Start the OpenVINO™ Model Server (OVMS) 

1. Navigate to the deploy directory from the terminal. 

cd deploy/

2. Run the commands below. 

kubectl create namespace monitoring 
kubectl create namespace inferencing 
## Workaround for Google Cloud Storage option 
sed -i '101 s/.Values.model_path/(and .Values.models_host_path .Values.model_path)/' ./ovms/templates/deployment.yaml 
sed -i '114 s/.Values.model_path/and .Values.models_host_path .Values.model_path/' ./ovms/templates/deployment.yaml

3. If you chose the Google Cloud Storage option, run the command below to start the OVMS. 

helm install --set model_name=stardist-0001,model_path=gs://<bucket-name>/models/,http_proxy=<http-proxy>,https_proxy=<https-proxy> ovms ./ovms

4. If you chose the Local Storage option, run the following command instead to start the OVMS: 

helm install --set model_name=stardist-0001,models_host_path=/opt/models,model_path=/models,http_proxy=<http-proxy>,https_proxy=<https-proxy> ovms ./ovms

5. Use the following command to start the server. 

helm install server ./server --set hostIP=<controller-IP> 

6. Run the commands below from the same directory to deploy Grafana, InfluxDB and Prometheus containers. 

helm install grafana ./grafana --set hostIP=<controller-IP> 
helm install influxdb ./influxdb 
helm install prometheus ./prometheus 

7. Run the command below to make sure the status of the ovms pod is Running.

kubectl get pods -n inferencing

8. Run the command below to observe the logs. 

kubectl logs -f -n inferencing <ovms_pod_name>
Screenshot of Logs of OVMS Pod
Figure 11. Logs of OVMS Pod

 

Note In the logs from the OVMS pod, verify that the model INFO lines are similar to the highlighted section in the screenshot above. The number of model versions should be 1. 
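
If you prefer to check the model status programmatically rather than by scanning logs, OVMS also exposes a TensorFlow Serving-compatible REST status endpoint. The sketch below is illustrative; the host and REST port are placeholders to replace with the address your ovms deployment exposes.

# Sketch: query OVMS model status via its TensorFlow Serving-compatible REST API.
import requests

OVMS_REST = "http://<controller-IP>:<rest-port>"       # placeholder endpoint

resp = requests.get(f"{OVMS_REST}/v1/models/stardist-0001", timeout=10)
resp.raise_for_status()
for version in resp.json()["model_version_status"]:
    # Expect a single version in the "AVAILABLE" state for this RI.
    print(version["version"], version["state"])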


Run the Application 

Note The following steps to run the application should be executed from a Client system. The Client should be a different system but on the same network as the Server, so that they are able to communicate. 

Once the installation is completed on the server, navigate to the following directory from the client system. 

# su 
# cd /root/TelePathology/client/

Activate the client's Python virtual environment using the following command. 

# source tele_pathology/bin/activate 

Start the client using the following command. 

# chmod +x run.sh 
# ./run.sh <IP_address_of_Controller>
Screenshot of Run the Application
Figure 12. Run the Application

 

Once the server is started, sample input images from the images/ directory on the client are sent to the server system for inference and processing. The results are sent back to the client and can be visualized on a dashboard on the client.


Data Visualization on Grafana 

1. Navigate to <Controller_IP>:30800 in your browser. 

2. Log in with user admin and password admin.  

3. Click on Home.

4. Select Telepathology Inference Metrics.

Screenshot of Inference Metrics on Grafana Dashboard
Figure 13. Inference Metrics on Grafana Dashboard

5. To view the cluster monitoring dashboard, click on the Dashboard name from the top-left corner of Grafana and select Kubernetes cluster monitoring dashboard.

Screenshot of  Choosing the Grafana Dashboard
Figure 14. Choosing the Grafana Dashboard

6. The dashboard with system usage metrics is displayed.

Screenshot of Kubernetes Cluster Monitoring on Grafana Dashboard
Figure 15. Kubernetes Cluster Monitoring on Grafana Dashboard

You can scroll down the dashboard and view specific metrics like Pods CPU usage, and Containers CPU usage as shown below. 

Screenshot of Kubernetes Cluster Monitoring on Grafana Dashboard (scroll down) 
Figure 16. Kubernetes Cluster Monitoring on Grafana Dashboard (scroll down) 

Stop the Application

To remove the deployment of this reference implementation, run the following commands. 

Note This will remove all the running pods and the data and configuration stored in the device. 

helm delete server 
helm delete grafana 
helm delete influxdb 
helm delete prometheus
helm delete ovms

Uninstall the Application 

1. Go to the network_optimization_and_ai_inferencing_management_for_telepathology/ directory.

cd /home/openness/network_optimization_and_ai_inferencing_management_for_telepathology

2. Run the command below to uninstall the Reference Implementation:  

./edgesoftware uninstall -a

3. When the uninstallation is complete, you will see the message Uninstallation of package complete and the uninstallation status for each module. 

Screen showing uninstall of package
Figure 17: Uninstallation Status

 


Summary and Next Steps

This reference implementation highlights the flexibility, modularity, and ease of deployment that OpenNESS provides on-premise and at the network edge. Coupling this with the OpenVINO™ Model Server's scalability for AI model deployment and management provides the software components needed to enable a telepathology service.   

As a next step, you can experiment with accuracy and throughput by using other pathology datasets.   

To understand more about OpenNESS architecture, building blocks and implementation types, we recommend this GitHub page.  


Learn More 

To continue learning, see the following guides and software resources: 


Troubleshooting 

Pods Status Check 

Verify that the pods are Ready and in the Running state using the command below: 

kubectl get pods -A

If they are in the ImagePullBackOff state, manually pull the images using: 

docker login 
docker pull <image-name>

Screenshot of error

If any pods are not in the Running state, use the following command to investigate: 

kubectl describe -n <namespace> pod <pod_name>

Docker Pull Limit Issue 

If a Docker pull limit error is observed, log in with your Docker premium account. 

If the Harbor pods are not in the Running state, log in using the command below: 

docker login

If the Harbor pods are in the Running state, log in using the commands below: 

docker login 
docker login https://<Machine_IP>:30003  
<Username - admin> 
<Password - Harbor12345> 

Installation Failure 

If the OpenNESS installation fails while pulling the OpenNESS namespace pods (Grafana, Telemetry, TAS, etc.), reboot the system and, after reboot, execute the following commands: 

reboot 
su  
swapoff -a  
systemctl restart kubelet (Wait till all pods are in “Running” state.) 
./edgesoftware install 

Pod Status Shows “ContainerCreating” for a Long Time 

If the pod status shows ContainerCreating, Error, or CrashLoopBackOff for a long time (5 minutes or more), run the following commands: 

reboot 
su  
swapoff -a  
systemctl restart kubelet (Wait till all pods are in “Running” state.) 
./edgesoftware install 

Subprocess:32 Issue 

If you see any error related to subprocess, run the command below: 

pip install --ignore-installed subprocess32==3.5.4 

ImportError 

If you observe an ImportError while running run.sh from the client, run the following command:

apt-get install ffmpeg libsm6 libxext6 -y

Screenshot of ImportError

IP Address Range Allocation for Various CNIs and Interfaces 

The OpenNESS Experience Kits deployment allocates and reserves a set of IP address ranges for different CNIs and interfaces. The server or host IP address should not conflict with this default allocation. If there is a critical need for the server to use an IP address within one of these ranges, the default addresses used by OpenNESS must be modified. 

The following files specify the CIDRs for the CNIs and interfaces. These are the IP address ranges allocated and used by default, listed here for reference: 

flavors/media-analytics-vca/all.yml:19:vca_cidr: "172.32.1.0/12" 
group_vars/all/10-default.yml:90:calico_cidr: "10.243.0.0/16" 
group_vars/all/10-default.yml:93:flannel_cidr: "10.244.0.0/16" 
group_vars/all/10-default.yml:96:weavenet_cidr: "10.32.0.0/12" 
group_vars/all/10-default.yml:99:kubeovn_cidr: "10.16.0.0/16,100.64.0.0/16,10.96.0.0/12" 
roles/kubernetes/cni/kubeovn/controlplane/templates/crd_local.yml.j2:13:  cidrBlock: "192.168.{{ loop.index0 + 1 }}.0/24"

The 192.168.x.y range is used for SR-IOV and interface service IP address allocation in the Kube-OVN CNI. The server IP address must not fall within this range, or it will conflict and cause erratic behavior. Avoid the entire 192.168.0.0/16 address range for the server IP address.

If the server/host IP address must use 192.168.x.y while this range is by default used for SR-IOV interfaces in OpenNESS, then the cidrBlock range in the roles/kubernetes/cni/kubeovn/controlplane/templates/crd_local.yml.j2 file must be changed to a non-conflicting range such as 192.167.{{ loop.index0 + 1 }}.0/24 to reconfigure the IP segment used for SR-IOV interfaces.
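
Before deployment, you can sanity-check a candidate server IP address against these defaults with a short Python script using the standard ipaddress module; the list below simply restates the default ranges above.

# Check whether a candidate server/host IP collides with the default
# OpenNESS CIDR allocations listed above.
import ipaddress

RESERVED_CIDRS = [
    "172.32.1.0/12",                                   # vca_cidr
    "10.243.0.0/16",                                   # calico_cidr
    "10.244.0.0/16",                                   # flannel_cidr
    "10.32.0.0/12",                                    # weavenet_cidr
    "10.16.0.0/16", "100.64.0.0/16", "10.96.0.0/12",   # kubeovn_cidr
    "192.168.0.0/16",                                  # SR-IOV / interface service range
]

def conflicts(host_ip: str) -> list:
    """Return the reserved CIDRs that contain host_ip (empty list means safe)."""
    ip = ipaddress.ip_address(host_ip)
    # strict=False tolerates entries with host bits set (e.g. 172.32.1.0/12).
    return [c for c in RESERVED_CIDRS if ip in ipaddress.ip_network(c, strict=False)]

print(conflicts("192.168.1.10"))   # -> ['192.168.0.0/16']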

Support Forum 

If you're unable to resolve your issues, contact the Support Forum.  

To attach the installation logs to your issue, execute the command below to consolidate the log files into a tar.gz archive, e.g., network_optimization_and_ai_inferencing_management_for_telepathology.tar.gz.  

tar -czvf network_optimization_and_ai_inferencing_management_for_telepathology.tar.gz /var/log/esb-cli/Network_Optimization_and_AI_Inferencing_Management_for_Telepathology_3.0.0/Network_Optimization_and_AI_Inferencing_Management_for_Telepathology/ /var/log/esb-cli/Network_Optimization_and_AI_Inferencing_Management_for_Telepathology_3.0.0/OpenNESS/

 


Citations

*github.com/mpicbg-csbd/stardist
@inproceedings{schmidt2018,
  author    = {Uwe Schmidt and Martin Weigert and Coleman Broaddus and Gene Myers},
  title     = {Cell Detection with Star-Convex Polygons},
  booktitle = {Medical Image Computing and Computer Assisted Intervention - {MICCAI} 
  2018 - 21st International Conference, Granada, Spain, September 16-20, 2018, Proceedings, Part {II}},
  pages     = {265--273},
  year      = {2018},
  doi       = {10.1007/978-3-030-00934-2_30}
}

@inproceedings{weigert2020,
  author    = {Martin Weigert and Uwe Schmidt and Robert Haase and Ko Sugawara and Gene Myers},
  title     = {Star-convex Polyhedra for 3D Object Detection and Segmentation in Microscopy},
  booktitle = {The IEEE Winter Conference on Applications of Computer Vision (WACV)},
  month     = {March},
  year      = {2020},
  doi       = {10.1109/WACV45572.2020.9093435}
}
**bbbc.broadinstitute.org/BBBC038
"We used image set BBBC038v1, available from the Broad Bioimage Benchmark Collection [Caicedo et al., Nature Methods, 2019]."

Product and Performance Information

1

Performance varies by use, configuration, and other factors. Learn more at www.intel.com/PerformanceIndex.