Machine Vision
Assembly Line
Increase manufacturing processes by 25% with AI, Opcenter and Retrocausal, a Siemens Partner
Basler AG: Innovation Leaders
How OSARO used Cognex to solve a tricky barcode reading challenge for Zenni Optical
📷 Making automated visual-inspection systems practical
Using supervised learning to train anomaly localization models has major drawbacks: compared to images of defect-free products, images of defective products are scarce, and labeling defective-product images is expensive. Consequently, our benchmarking framework doesn’t require any anomalous images in the training phase. Instead, from the defect-free examples, the model learns a distribution of typical image features.
We have released our benchmark in the hope that other researchers will expand on it, to help bridge the gap between the impressive progress on anomaly localization in research and the challenges of real-world implementation.
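The idea of learning a distribution of typical features from defect-free examples alone can be sketched in a few lines. This is a minimal illustration, not the benchmark's method. It assumes features have already been extracted on a spatial grid, replaces them with synthetic random data, and models each grid location with an independent Gaussian. The function names `fit_normal_model` and `anomaly_map` are hypothetical.

```python
import numpy as np

def fit_normal_model(features):
    """Fit a per-location Gaussian to features from defect-free images.

    features: array of shape (n_images, height, width, channels).
    Returns the per-location mean and standard deviation.
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0) + 1e-6  # small offset avoids division by zero
    return mean, std

def anomaly_map(feature, mean, std):
    """Score each location by its mean squared z-score across channels."""
    z = (feature - mean) / std
    return (z ** 2).mean(axis=-1)

rng = np.random.default_rng(0)

# Synthetic stand-in for features from 50 defect-free images (assumption).
train = rng.normal(loc=0.5, scale=0.05, size=(50, 8, 8, 4))
mean, std = fit_normal_model(train)

# One test sample with a simulated defect at row 2, column 5.
test = rng.normal(loc=0.5, scale=0.05, size=(8, 8, 4))
test[2, 5] += 1.0

scores = anomaly_map(test, mean, std)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
print("highest anomaly score at", peak)
```

No defective image is used for fitting. The defect is localized only because its features fall far outside the distribution learned from normal samples.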
🧠🦾 Google’s Robotic Transformer 2: More Than Meets the Eye
Google DeepMind’s Robotic Transformer 2 (RT2) is an evolution of vision language model (VLM) software. Trained on images from the web, RT2 software employs robotics datasets to manage low-level robotics control. Traditionally, VLMs have been used to combine inputs from both visual and natural language text datasets to accomplish more complex tasks. Of course, ChatGPT is at the front of this trend.
Google researchers identified a gap in how current VLMs were being applied in the robotic space. They note that current methods and approaches tend to focus on high-level robotic theory such as strategic state machine models. This leaves a void in the lower-level execution of robotic action, where the majority of control engineers execute work. Thus, Google is attempting to bring the power and benefits of VLMs down into the control engineers’ domain of programming robotics.
🛣️ America’s Bridges, Factories and Highways Are in Dire Need of Repairs. Bring in the Robots.
These days, Shell is able to keep the plant running, and keep repair personnel on the ground and at a safe distance as they operate wall-climbing robots that inspect things like steel holding tanks at millimeter resolution, says Steven Treviño, a robotics engineer at Shell. Using a variety of sensors, the robots can look for both corrosion and cracking. This helps the team shorten the list of things they have to take care of when a full shutdown occurs. The magnetic wall climbers Shell is using are made by a Pittsburgh-based startup called, appropriately, Gecko Robotics. After testing the Gecko robots at Geismar, Shell plans to expand their use to offshore facilities.
“There are hundreds of types of corrosion,” says Jake Loosararian, CEO of Gecko Robotics, “and we’ve been developing technology and software to analyze what kind of damage is happening.” Gecko began as a robotics company, but has since expanded into creating software to process the data its robots gather. The startup makes systems that are now used to track more than 60,000 assets across the globe, including power plants, pipelines, oil refineries, dams, U.S. Navy vessels and other military equipment.
When it comes to inspections, “often the data you need is literally in plain sight, it’s just hard to collect it,” says Bry, of Skydio.
AI Transformer Models Enable Machine Vision Object Detection
Machine vision is another key technology, and today AI and machine vision interact in a few ways. “First, machine vision output is fed to an AI engine to perform functions such as people counting, object recognition, etc., to make decisions,” said Arm’s Zyazin. “Second, AI is used to provide better quality images with AI-based de-noising, which then assists with decision-making. An example could be an automotive application where a combination of AI and machine vision can recognize a speed limit sign earlier and adjust the speed accordingly.”
“There are a few main directions for machine vision, including cloud computing to scale deep-learning solutions, automated ML architectures to improve the ML pipeline, transformer architectures that optimize computer vision (a superset of machine vision), and mobile devices incorporating computer vision technology on the edge,” Synopsys’ Andersen said.
🧠🦾 RT-2: New model translates vision and language into action
Robotic Transformer 2 (RT-2) is a novel vision-language-action (VLA) model that learns from both web and robotics data, and translates this knowledge into generalised instructions for robotic control.
High-capacity vision-language models (VLMs) are trained on web-scale datasets, making these systems remarkably good at recognising visual or language patterns and operating across different languages. But for robots to achieve a similar level of competency, they would need to collect robot data, first-hand, across every object, environment, task, and situation.
In our paper, we introduce Robotic Transformer 2 (RT-2), a novel vision-language-action (VLA) model that learns from both web and robotics data, and translates this knowledge into generalised instructions for robotic control, while retaining web-scale capabilities.
📐 UCLA Researchers Propose PhyCV: A Physics-Inspired Computer Vision Python Library
In the latest innovation, Jalali-Lab @ UCLA has developed a new Python library called PhyCV, which is the first Physics-based Computer vision Python library. This unique library uses algorithms based on the laws and equations of physics to analyze pictorial data. These algorithms imitate how light passes through several physical materials and are based on mathematical equations rather than a series of hand-crafted rules. The algorithms in PhyCV are built on the principles of a rapid data acquisition method called the photonic time stretch.
The three algorithms included in PhyCV are the Phase-Stretch Transform (PST) algorithm, the Phase-Stretch Adaptive Gradient-Field Extractor (PAGE) algorithm, and the Vision Enhancement via Virtual diffraction and coherent Detection (VEViD) algorithm.
Behind the A.I. tech making BMW vehicle assembly more efficient
Vision retrofits are the quick automation wins you should do now
A vision retrofit installs the latest AI-powered vision technologies to optimize the performance of a robotic cell. These technologies can significantly improve productivity by allowing robots to operate faster in a wider range of operating conditions and account for randomness intelligently. By upgrading existing 3D camera or fixtured setups with this AI-powered solution, companies can improve their operations and gain a competitive edge.
With AI-powered vision you only need to add an extrusion above the cell for 2D cameras (or just remove the existing 3D camera). With a focused investment under $50K, a few hours of downtime and a project time under six weeks, you could have a cell performing faster and more reliably.
Meta-Transformer: A Unified Framework for Multimodal Learning
Multimodal learning aims to build models that can process and relate information from multiple modalities. Despite years of development in this field, it still remains challenging to design a unified network for processing various modalities (e.g. natural language, 2D images, 3D point clouds, audio, video, time series, tabular data) due to the inherent gaps among them. In this work, we propose a framework, named Meta-Transformer, that leverages a frozen encoder to perform multimodal perception without any paired multimodal training data. In Meta-Transformer, the raw input data from various modalities are mapped into a shared token space, allowing a subsequent encoder with frozen parameters to extract high-level semantic features of the input data. Composed of three main components: a unified data tokenizer, a modality-shared encoder, and task-specific heads for downstream tasks, Meta-Transformer is the first framework to perform unified learning across 12 modalities with unpaired data. Experiments on different benchmarks reveal that Meta-Transformer can handle a wide range of tasks including fundamental perception (text, image, point cloud, audio, video), practical application (X-Ray, infrared, hyperspectral, and IMU), and data mining (graph, tabular, and time-series). Meta-Transformer indicates a promising future for developing unified multimodal intelligence with transformers.
🧠📹 What Sets Toshiba’s Ceramic Balls Apart? The AI Quality Inspection System
Bearings cannot be easily replaced once a vehicle is assembled. In the U.S., bearings used in EVs are expected to be of high enough quality to withstand long distances. One issue that can occur with EVs, however, is the “electric corrosion” of the bearings that mount the various vital parts of the vehicle onto the motor—a serious issue, as it can lead to the breakdown of the vehicle. High-performance bearings would drive the widespread use of EVs, and contribute to the push towards carbon neutrality. The electrical corrosion phenomenon had hampered these efforts, but not anymore—therein lies the beauty of Toshiba’s ceramic balls.
“Our ceramic balls go through slight changes about every year and a half due to changes in material and other factors. To keep up the accuracy of the quality inspections, we have to continually update the AI system itself. The MLOps system automates that process,” says Kobatake.
“We’ve been able to dramatically reduce the time spent on these inspections. Ceramic balls are expensive compared to their metal counterparts. They have so many different strengths, and yet they haven’t been able to replace the metal ones precisely because of this particular issue. If we’re able to reduce the cost through AI quality inspection, we’ll be able to lower the price of the products themselves,” says Yamada.
Apera AI & Mitsubishi Electric Automation Making Robotic Vision Simple
ImageBind: One Embedding Space To Bind Them All
We present ImageBind, an approach to learn a joint embedding across six different modalities - images, text, audio, depth, thermal, and IMU data. We show that all combinations of paired data are not necessary to train such a joint embedding, and only image-paired data is sufficient to bind the modalities together. ImageBind can leverage recent large scale vision-language models, and extends their zero-shot capabilities to new modalities just by using their natural pairing with images. It enables novel emergent applications ‘out-of-the-box’ including cross-modal retrieval, composing modalities with arithmetic, cross-modal detection and generation. The emergent capabilities improve with the strength of the image encoder and we set a new state-of-the-art on emergent zero-shot recognition tasks across modalities, outperforming specialist supervised models. Finally, we show strong few-shot recognition results outperforming prior work, and that ImageBind serves as a new way to evaluate vision models for visual and non-visual tasks.
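Cross-modal retrieval and "composing modalities with arithmetic" both reduce to vector operations once every modality lives in one embedding space. The sketch below shows only that mechanism. The three-dimensional embeddings are hand-written toy values rather than ImageBind outputs, and the helpers `normalize` and `retrieve` are hypothetical.

```python
import numpy as np

def normalize(x):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def retrieve(query, gallery):
    """Return the index of the gallery embedding most similar to the query."""
    return int(np.argmax(gallery @ query))

# Toy image embeddings in a shared space (assumed values, not model outputs).
labels = ["dog", "beach", "dog on beach"]
image_embeddings = normalize(np.array([
    [1.0, 0.1, 0.0],
    [0.0, 1.0, 0.1],
    [0.7, 0.7, 0.0],
]))

# Toy embeddings for a barking sound and the text "beach" (assumed values).
audio_bark = normalize(np.array([0.9, 0.0, 0.1]))
text_beach = normalize(np.array([0.1, 0.9, 0.0]))

# Cross-modal retrieval: an audio query ranked against images.
audio_hit = labels[retrieve(audio_bark, image_embeddings)]

# Embedding arithmetic: add two modalities, then retrieve an image.
combined = normalize(audio_bark + text_beach)
combined_hit = labels[retrieve(combined, image_embeddings)]

print(audio_hit, "|", combined_hit)
```

In a real system, the encoders that produce these vectors are the hard part. Once they are aligned, retrieval is a nearest-neighbour lookup like the one shown here.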
Improving Image Resolution At The Edge
AI vision for print quality inspection on bottles
Basler Lens selector: Match the right lens for your camera
How is 3D machine vision transforming manufacturing processes?
3D machine vision employs 3D cameras that provide robots with data and information pertaining to particular parts. These three-dimensional cameras can be installed at various locations to create 360-degree, multi-angle images for surface and volume inspection.
The topographical map results from reflected laser displacement. Capturing images from two distinct angles makes it possible to recover the 3D data of the scene: the separation between the two perspectives in 3D space is computed, and installed software then performs substantial image processing and analysis. To evaluate an object with machine vision software, a PC-based machine vision system is hardwired to vision cameras and image capture boards.
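For the two-angle case, a common textbook model is a rectified stereo pair, where depth follows from the pixel shift (disparity) of a point between the two images: Z = f · B / d. The source does not say which method any particular system uses, so the sketch below is an assumption-laden illustration. The function name and the numeric values are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth of a point seen by a rectified stereo pair (pinhole model).

    focal_length_px: focal length expressed in pixels.
    baseline_m: distance between the two camera centres, in metres.
    disparity_px: horizontal shift of the point between the two images.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px

# Example values (assumed): 1200 px focal length, 12.5 cm baseline.
near = depth_from_disparity(1200.0, 0.125, 60.0)
far = depth_from_disparity(1200.0, 0.125, 30.0)
print(near, far)
```

Halving the disparity doubles the estimated depth, which is why depth resolution degrades for distant parts and why a wider baseline helps.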
AI Driven Vision Inspection Automation for Bevel Gears
Edge Learning: AI for Industrial Machine Vision Made Easy
Plastic Bottles Defect Inspection Using Omron FH Vision System with AI
How machine vision works in RIBE Anlagentechnik’s camera-monitored assembly facility
The German company RIBE Anlagentechnik develops innovative assembly systems, including inspection systems, for bumpers. SICK’s machine vision helps to identify the individual components, and it also monitors each work operation. This particular system concept could prove revolutionary for other manufacturers and suppliers as well.
As the level of individualization in production areas increases, so does the importance of special-purpose systems with innovative potential. RIBE Anlagentechnik specializes in delivering added value to its end customers. The company has demonstrated its specific strengths in technologies associated with assembly and inspection systems for vehicle interiors/exteriors and related components. Managing Director Dietmar Heckel regards the cobot and robot technologies with innovative Industry 4.0 solutions and digitalization concepts not only as a supporting pillar of RIBE Anlagentechnik, but also as a cross-sectoral growth field.
Where Four-Legged Robot Dogs Are Finding Work
High-Performance Machine Vision: Versatile lighting for subtle surface defects
Fabs Drive Deeper Into Machine Learning
For the past couple of decades, semiconductor manufacturers have relied on computer vision, which is one of the earliest applications of machine learning in semiconductor manufacturing. Referred to as Automated Optical Inspection (AOI), these systems use signal processing algorithms to identify macro and micro physical deformations.
Defect detection provides a feedback loop for fab processing steps. Wafer test results produce bin maps (good or bad die), which also can be analyzed as images. Their data granularity is significantly larger than the pixelated data from an optical inspection tool. Yet test results from wafer maps can match the splatters generated during lithography and scratches produced from handling that AOI systems can miss. Thus, wafer test maps give useful feedback to the fab.
3D Vision Technology Advances to Keep Pace With Bin Picking Challenges
When a bin has one type of object with a fixed shape, bin picking is straightforward, as CAD models can easily recognize and localize individual items. But randomly positioned objects can overlap or become entangled, presenting one of the greatest challenges in bin picking. Identifying objects with varying shapes, sizes, colors, and materials poses an even larger challenge, but by deploying deep learning algorithms, it is possible to find and match objects that do not conform to one single geometrical description but belong to a general class defined by examples, according to Andrea Pufflerova, Public Relations Specialist at Photoneo.
“A well-trained convolutional neural network (CNN) can recognize and classify mixed and new types of objects that it has never come across before.”
Vision Cameras Inspect Disk Drive Assemblies
Once manufactured, an HDD is carefully fitted and sealed in a metal or plastic case. The case ensures that all drive components are perfectly secured in place and their mechanics work well over the lifetime of the product. It also protects the sensitive disks from dust, humidity, shock and vibration.
An HDD case must be defect-free and have perfectly machined thread holes to perform these functions, according to Somporn Kornwong, a manager at Flexon. In 2019 his company developed Visual Machine Inspection (VMI) for a manufacturer so it can quickly and thoroughly inspect each case it produces.
Simplify Deep Learning Systems with Optimized Machine Vision Lighting
Deep learning cannot compensate for or replace quality lighting. This experiment’s results would hold true over a wide variety of machine vision applications. Poor lighting configurations will result in poor feature extraction and increased defect detection confusion (false positives).
Several rigorous studies show that classification accuracy reduces with image quality distortions such as blur and noise. In general, while deep neural networks perform better than or on par with humans on quality images, a network’s performance is much lower than a human’s when using distorted images. Lighting improves input data, which greatly increases the ability of deep neural network systems to compare and classify images for machine vision applications. Smart lighting — geometry, pattern, wavelength, filters, and more — will continue to drive and produce the best results for machine vision applications with traditional or deep learning systems.
Perceiver: General Perception with Iterative Attention
Biological systems perceive the world by simultaneously processing high-dimensional inputs from modalities as diverse as vision, audition, touch, proprioception, etc. The perception models used in deep learning on the other hand are designed for individual modalities, often relying on domain-specific assumptions such as the local grid structures exploited by virtually all existing vision models. These priors introduce helpful inductive biases, but also lock models to individual modalities. In this paper we introduce the Perceiver - a model that builds upon Transformers and hence makes few architectural assumptions about the relationship between its inputs, but that also scales to hundreds of thousands of inputs, like ConvNets. The model leverages an asymmetric attention mechanism to iteratively distill inputs into a tight latent bottleneck, allowing it to scale to handle very large inputs. We show that this architecture is competitive with or outperforms strong, specialized models on classification tasks across various modalities: images, point clouds, audio, video, and video+audio. The Perceiver obtains performance comparable to ResNet-50 and ViT on ImageNet without 2D convolutions by directly attending to 50,000 pixels. It is also competitive in all modalities in AudioSet.
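The asymmetric attention described in the abstract can be illustrated with a single cross-attention step in which a small latent array supplies the queries and the large input array supplies the keys and values. The sketch below is not the Perceiver itself: it omits latent self-attention, layer normalization, MLPs and position encodings, and it uses random untrained weights. The sizes and the helper `cross_attend` are assumptions chosen for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(latents, inputs, w_q, w_k, w_v):
    """Latents (N, D) query inputs (M, C); the result has shape (N, D)."""
    q = latents @ w_q                          # (N, D)
    k = inputs @ w_k                           # (M, D)
    v = inputs @ w_v                           # (M, D)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (N, M): cost grows with N * M
    return softmax(scores) @ v

rng = np.random.default_rng(0)
M, C = 50_000, 3   # e.g. 50,000 pixels with 3 channels each
N, D = 64, 32      # latent bottleneck, far smaller than the input

inputs = rng.normal(size=(M, C))
latents = rng.normal(size=(N, D))
w_q = rng.normal(size=(D, D)) * 0.1
w_k = rng.normal(size=(C, D)) * 0.1
w_v = rng.normal(size=(C, D)) * 0.1

# Iteratively distill the inputs into the latents (residual update).
for _ in range(2):
    latents = latents + cross_attend(latents, inputs, w_q, w_k, w_v)

print(latents.shape)
```

Because the score matrix has shape (N, M) rather than (M, M), the cost grows linearly with the number of inputs. That linear scaling is what lets the model attend directly to tens of thousands of pixels.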
Tilling AI: Startup Digs into Autonomous Electric Tractors for Organics
Ztractor offers tractors that can be configured to work on 135 different types of crops. They rely on the NVIDIA Jetson edge AI platform for computer vision tasks to help farms improve plant conditions, increase crop yields and achieve higher efficiency.
AI Vision for Monitoring Applications in Manufacturing and Industrial Environments
In traditional industrial and manufacturing environments, monitoring worker safety, enhancing operator efficiency, and improving quality assurance were physical tasks. Today, AI-enabled machine vision technologies replace many of these inefficient, labor-intensive operations for greater reliability, safety, and efficiency. This article explores how, by deploying AI smart cameras, further performance improvements are possible since the data used to empower AI machine vision comes from the camera itself.
Tools Move up the Value Chain to Take the Mystery Out of Vision AI
Intel DevCloud for the Edge and Edge Impulse offer cloud-based platforms that take most of the pain points away with easy access to the latest tools and software. Meanwhile, Xilinx and others have started offering complete systems-on-module with production-ready applications that can be deployed with tools at a higher level of abstraction, removing the need for some of the more specialist skills.
How the USPS Is Finding Lost Packages More Quickly Using AI Technology from Nvidia
In one of its latest technology innovations, the USPS got AI help from Nvidia to fix a problem that has long confounded existing processes – how to better track packages that get lost within the USPS system so they can be found in hours instead of in several days. In the past, it took eight to 10 people several days to locate and recover lost packages within USPS facilities. Now it is done by one or two people in a couple hours using AI.
Hyperspectral imaging aids precision farming
Remote sensing techniques have evolved rapidly thanks to technological progress and the spread of multispectral cameras. Hyperspectral imaging is the capture and processing of an image at a very high number of wavelengths. While multispectral imaging evaluates a scene with three or four bands (red, green, blue and near infrared), hyperspectral imaging splits the image into tens or hundreds of bands. By using spectroscopy, which identifies materials based on how light behaves when it hits a subject, hyperspectral imaging obtains far more spectral data for each pixel in the image of a scene.
Unlike radiography, hyperspectral imaging is a non-destructive, non-contact technology that can be used without damaging the object being analyzed. For example, a drone with a hyperspectral camera can detect plant diseases, weeds, soil erosion problems, and can also estimate crop yields.
John Deere and Audi Apply Intel’s AI Technology
Identifying defects in welds is a common quality control process in manufacturing. To make these inspections more accurate, John Deere is applying computer vision, coupled with Intel’s AI technology, to automatically spot common defects in the automated welding process used in its manufacturing facilities.
At Audi, automated welding applications range from spot welding to riveting. The widespread automation in Audi factories is part of the company’s goal of creating Industrie 4.0-level smart factories. A key aspect of this goal involves Audi’s recognition that creating customized hardware and software to handle individual use cases is not preferable. Instead, the company focuses on developing scalable and flexible platforms that allow it to more broadly apply advanced digital capabilities such as data analytics, machine learning, and edge computing.
F-16s Are Now Getting Washed By Robots
The Wilder Systems solution actually leverages technology previously developed for robotic drilling in commercial aircraft manufacturing and converts these components and subsystems into an automated washing system. The main changes have involved the development and addition of robot end-effectors to provide the water and soap spray, waterproofing of the robots themselves, and a robot motion path, which is dependent on the type of aircraft to be cleaned.
Machine learning optimizes real-time inspection of instant noodle packaging
During the production process, various factors can lead to the seasoning sachets slipping between two noodle blocks and being cut open by the cutting machine, or being packed separately in two packets side by side. Such defective products would result in consumer complaints and damage to the company’s reputation, so delivery of such products to dealers should be reduced as far as possible. Since the machine type upgraded by Tianjin FengYu already had a very low error rate, another aspect of quality control is critical: the system must reliably sort out only the defective products and not the defect-free ones.
Tractor Maker John Deere Using AI on Assembly Lines to Discover and Fix Hidden Defective Welds
John Deere performs gas metal arc welding at 52 factories where its machines are built around the world, and it has proven difficult to find defects in automated welds using manual inspections, according to the company.
That’s where the successful pilot program between Intel and John Deere has been making a difference, using AI and computer vision from Intel to “see” welding issues and get things back on track to keep John Deere’s pilot assembly line humming along.
Harvesting AI: Startup’s Weed Recognition for Herbicides Grows Yield for Farmers
In 2016, the former dorm-mates at École Nationale Supérieure d’Arts et Métiers, in Paris, founded Bilberry. The company today develops weed recognition powered by the NVIDIA Jetson edge AI platform for precision application of herbicides at corn and wheat farms, offering as much as a 92 percent reduction in herbicide usage.
Driven by advances in AI and pressures on farmers to reduce their use of herbicides, weed recognition is starting to see its day in the sun.
Analysing fruit data in the supply chain has never been more important for business efficiency
Fruit and production data can be used in ways they never have been before to improve a company’s efficiency and boost profits, according to global packhouse equipment and automation supplier Tomra Food.
He added that there are several useful data types at play in a packhouse: production and traceability level data, performance level data, quality data and auditing data. This data can be used to optimise the supply chain and to decide on the next big thing that needs to be done. But consumer trends will constantly change the requirements of automation.