Kongsberg Maritime is pioneering autonomous ships and using artificial intelligence (AI) to support crews with navigation at sea. The company’s AI solutions have traditionally been based on GPUs, but Kongsberg Maritime would prefer to use CPUs so it can simplify and consolidate its servers more easily. Working with Intel, Kongsberg Maritime was able to increase its solution’s performance on standard server hardware by 4.8x² on one of the company’s demonstrator projects.
- Optimize AI performance, so that Kongsberg Maritime’s object recognition solution for marine navigation can process more images per second.
- Enable server consolidation and redundancy by meeting Kongsberg Maritime’s performance expectations on standard servers.
- The Intel® Distribution of OpenVINO™ toolkit was used to accelerate the performance of Kongsberg Maritime’s TensorFlow* model running on the Intel® Xeon® Platinum processor.
- The Intel® Distribution for Python* was used to share work across 64 threads on the two-socket server.
- Intel’s expert team optimized the server settings and modified OpenVINO toolkit to enhance the performance.
- Image throughput was increased by 4.8x¹ compared to the unoptimized baseline.
- Kongsberg Maritime can look at using standard server hardware for its marine navigation solution, increasing redundancy in the architecture and smoothing the path to marine certification.
Achieving Fast AI Inference
Over the last 10 years, 1,129 ships have been lost at sea.³ Congested seas can be a significant risk factor in some regions, and human error accounts for three-quarters of all shipping insurance losses, totaling $1.6 billion³ between 2011 and 2016.
Kongsberg Maritime has a vision to improve safety and increase the efficiency of shipping. The company plans to use AI to guide sailors on board, enable remote control from the shore, and ultimately to steer autonomous ocean-going vessels.
By 2025, the company plans to have a short-sea vessel operating under remote and autonomous control. Beyond that, international regulation will be the biggest barrier to launching on the open seas.
Kongsberg Maritime has already demonstrated a fully autonomous car ferry, operating in Finnish waters.⁴ In this demonstration, with 80 VIPs on board, Kongsberg Maritime technologies were used to navigate autonomously on the outbound journey, using sensors and cameras to detect and avoid objects. The ship berthed automatically, and remote control was used to steer the return journey.
The first step towards enabling fully autonomous vehicles, and the first commercially available product from Kongsberg Maritime for this, is called Intelligent Awareness.* It uses radar for long-distance object detection, lidar for a highly accurate analysis of the area nearer the ship, and high-definition cameras to capture a 180-degree view of the sea in front of the ship. The ship’s crew can use dashboards to see the waters around the ship, with the solution highlighting any potential hazards. The solution helps to mitigate navigator risk, especially in the dark or in adverse weather conditions, or when carrying out tricky maneuvers such as navigating congested waters or docking and undocking.
The solution currently uses GPUs for the real-time artificial intelligence analysis, which is known as inference. “We would prefer to get rid of those GPUs,” said Jaakko Saarela, project manager at Kongsberg Maritime. “One important reason is marine certification. It is much easier for us to get our servers certified if we do not use GPUs. Also, we would like to reduce our power consumption. It would be ideal if we could use generic server systems, which are all similar, too. We don’t need GPUs in all the servers, so it would be better if no servers used GPUs so that we have redundancy and can run any application on any server.”
The solution is based on about 10 server-class computers that run different parts of the application, with a high-speed internal network between the components. Kongsberg Maritime would now like to consolidate servers, so it has been investigating how the image processing can be carried out using CPUs instead of GPUs. “The neural network inference is the most challenging part,” said Saarela.
The challenge was to optimize the CPU-based solution so that it would be fast enough to detect potentially fast-moving objects at sea, such as motor boats passing across the bow of the ship.
The Intelligent Awareness solution uses TensorFlow*, a popular open source machine learning framework. Kongsberg Maritime has chosen to use a region-based fully convolutional network (R-FCN) model for object recognition, with ResNet-101* used for image classification in the back end. “We tried several architectures, and found that R-FCN provides a good trade-off between the computational performance (speed) and the inference accuracy,” said Saarela. “The big challenge is the scaling variance. The same objects can appear at different sizes, from 10 pixels square to 100,000 pixels square, depending on their distance.”
Intel worked with Kongsberg Maritime on optimizing the solution, with Kongsberg Maritime providing a pretrained AI model for Intel to use. The Intel Distribution of OpenVINO toolkit helped achieve higher throughput, without sacrificing accuracy. OpenVINO toolkit converts a trained model into an intermediate representation (IR), removing any operations that are only relevant to training and fusing together some of the inference operations so they can be computed more quickly. That intermediate representation is then processed by the OpenVINO inference engine, which returns information about identified objects to the Kongsberg Maritime application, as shown in Figure 1. Modifications were made to OpenVINO R4, which have now been incorporated into OpenVINO R5.
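To make the idea of fusing inference operations concrete, here is a minimal sketch of one common fusion: folding a batch-normalization step into the affine operation that precedes it. The source does not say which fusions were applied to the Kongsberg Maritime model, so this is an illustration of the general technique, not the toolkit's implementation. The function names are hypothetical, and a single scalar channel stands in for a full convolution.

```python
import math

def fold_batch_norm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold batch-norm parameters into a preceding affine op (y = weight * x + bias).

    Returns a (weight, bias) pair that gives the same output in one step.
    """
    scale = gamma / math.sqrt(var + eps)
    return weight * scale, (bias - mean) * scale + beta

def unfused(x, weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Two separate operations, as they would appear in a training graph."""
    y = weight * x + bias
    return gamma * (y - mean) / math.sqrt(var + eps) + beta

def fused(x, folded_weight, folded_bias):
    """One operation at inference time, using the folded parameters."""
    return folded_weight * x + folded_bias

# Illustrative parameter values only; they are not taken from any real model.
params = dict(weight=0.8, bias=0.1, gamma=1.5, beta=-0.2, mean=0.05, var=0.3)
folded_w, folded_b = fold_batch_norm(**params)
```

Because the batch-norm statistics are fixed after training, the folded parameters can be computed once, and every inference call then saves one pass over the data.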
The solution is based on two Intel® Xeon® Platinum 8153 processors with 16 cores each. Each core can process two threads, so a total of 64 models can be processed in parallel (2 processors x 16 cores x 2 threads). To distribute the work across the threads, Intel used the mpi4py* library, which is included in the Intel® Distribution for Python* and is more commonly used for distributing work across separate servers.
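The following is a minimal sketch of how work might be split across 64 processes with mpi4py, assuming one process per hardware thread and a simple round-robin assignment of frames. The source does not describe the actual partitioning scheme; `frames_for_rank`, the frame count, and the stand-in for inference are hypothetical.

```python
SOCKETS = 2
CORES_PER_SOCKET = 16
THREADS_PER_CORE = 2
WORKERS = SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE  # 64

def frames_for_rank(num_frames, rank, world_size):
    """Round-robin assignment: rank r handles frames r, r + size, r + 2*size, ..."""
    return list(range(rank, num_frames, world_size))

def main():
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not available; skipping the distributed run")
        return
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    my_frames = frames_for_rank(1000, rank, size)
    # Stand-in for running inference on each assigned frame.
    processed = len(my_frames)

    # Sum the per-process counts on rank 0.
    total = comm.reduce(processed, op=MPI.SUM, root=0)
    if rank == 0:
        print(f"{size} processes handled {total} frames")

if __name__ == "__main__":
    main()
```

Under these assumptions, the script would be launched with something like `mpiexec -n 64 python script.py`, so that each process loads its own copy of the model.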
To further increase performance, the Intel team made small modifications to the default processor settings and tuned hyperparameters, including pinning threads to specific cores.
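As an illustration of pinning, here is a minimal sketch that restricts the current process to a single core using Python's standard library on Linux. The source does not say which mechanism Intel used (pinning can also be set through the MPI launcher or environment variables), so `pin_to_core` is a hypothetical helper, not the team's method.

```python
import os

def pin_to_core(worker_index):
    """Pin the calling process to one core, chosen by worker index.

    Returns the core ID, or None on platforms without sched_setaffinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # Not Linux: leave scheduling to the operating system.
    allowed = sorted(os.sched_getaffinity(0))
    core = allowed[worker_index % len(allowed)]
    os.sched_setaffinity(0, {core})  # 0 means "the calling process"
    return core
```

Pinning stops the scheduler from migrating a worker between cores, which helps its cached data stay local to the core doing the work.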
Figure 1. High-level inference procedure using OpenVINO™ toolkit. Tasks performed in a deep learning framework are depicted in light blue, OpenVINO toolkit tasks are in blue, and the user application is in orange.
Intel Enables Transformation
The optimizations were carried out by the Artificial Intelligence Products Group at Intel. The group includes data scientists who work with customers to help them to create effective AI solutions based on Intel® technologies. Intel also provided hardware to enable Kongsberg Maritime to test the solution.
“The Intel team has all the expertise in how to optimize solutions for the Intel® Xeon® platform, and the team is easy to work with,” said Saarela. “I have been working with Intel people from many departments for two years now, and I’ve been really impressed with how professional and proactive they are. They offer us so many possibilities with new tools, and new ways to do things. We have been working with TensorFlow a lot, but the resources usually assume that you will be using GPUs. Working with Intel has enabled us to optimize our solution for CPUs, so we can benefit from using a more standardized server platform.”
Following the optimization process, throughput (measured in frames per second) was increased by 4.8x² on one of Kongsberg Maritime’s demonstrator projects.
To show that the improvements generalize beyond the R-FCN topology, Intel also tested the optimizations using the single shot multibox detector (SSD) topology, which is typically less accurate than R-FCN. Throughput was increased by 4.5x² when the optimized platform was compared against the unoptimized platform. Using the Intel Distribution of OpenVINO toolkit alone increased performance by 2.4x.²
“I’m impressed with the results,” said Saarela. “I had assumed we would always need GPUs, but this has changed my mind about what is possible using CPUs.”
Figure 2. Using the Intel® Distribution of OpenVINO™ toolkit and multithreading on an optimized platform, performance increased by 4.8x¹, measured in frames per second (FPS).
Technical Components of the Solution
- Intel Xeon Platinum processor. Intel Xeon Platinum processors are the foundation for secure, agile, hybrid-cloud data centers. With exceptional multi-socket processing performance, these processors are built for mission-critical, real-time analytics, machine learning, artificial intelligence, and multi-cloud workloads. With trusted, hardware-enhanced data service delivery, this processor family delivers monumental leaps in I/O, memory, storage, and network technologies to harness actionable insights from our increasingly data-fueled world.
- Intel Distribution of OpenVINO toolkit. Based on convolutional neural networks (CNN), the toolkit extends workloads across Intel® hardware (including accelerators) and maximizes performance. It helps developers to create solutions that emulate human vision.
- Intel Distribution for Python. Using Intel Distribution for Python, you can achieve faster Python application performance with minimal code changes; accelerate the NumPy*, SciPy*, and scikit-learn* libraries; and access the latest vectorization and multithreading instructions.