Posts with «author_name|andrew tarantola» label

Boston Dynamics and other industry heavyweights pledge not to build war robots

The days of Spot being leveraged as a weapons platform and training alongside special forces operators are already coming to an end; Atlas as a back-flipping soldier of fortune will never come to pass. Their maker, Boston Dynamics, along with five other industry leaders, announced on Thursday, in a non-binding open letter they all signed, that they will not pursue, or allow, the weaponization of their robots.

Agility Robotics, ANYbotics, Clearpath Robotics, Open Robotics and Unitree Robotics all joined Boston Dynamics in the agreement. "We believe that adding weapons to robots that are remotely or autonomously operated, widely available to the public, and capable of navigating to previously inaccessible locations where people live and work, raises new risks of harm and serious ethical issues," the group wrote. "Weaponized applications of these newly-capable robots will also harm public trust in the technology in ways that damage the tremendous benefits they will bring to society." 

The group cites "the increasing public concern in recent months caused by a small number of people who have visibly publicized their makeshift efforts to weaponize commercially available robots," such as the rifle-equipped quadruped from Ghost Robotics or the Dallas PD's 2016 use of a bomb disposal robot to deliver a lethal explosive, as the reason it felt the need to take this stand. 

To that end, the industry group pledges to "not weaponize our advanced-mobility general-purpose robots or the software we develop that enables advanced robotics and we will not support others to do so." Nor, when possible, will they allow their customers to subsequently weaponize any platforms they've purchased. That's a big caveat, given the long and storied history of improvised weapons like the Toyota technical: Hilux pickups converted into DIY war machines that have been a mainstay of asymmetric conflicts since the '80s.

"We also pledge to explore the development of technological features that could mitigate or reduce these risks," the group continued, but "to be clear, we are not taking issue with existing technologies that nations and their government agencies use to defend themselves and uphold their laws." They also call on policymakers as well as the rest of the robotics development community to take up similar pledges. 

This biomechanical art installation gets stabby to the beat of a rhododendron’s electrical noise

Kinetic installation artist David Bowen has given a rhododendron a really big knife, the power to use it, and therefore, a degree of agency not enjoyed by the kingdom Plantae since the Cambrian era. His latest piece, Plant Machete, melds a woody shrub with an industrial robot arm and slaps a machete to the business end of it. On the other end, a series of electrical pickups monitor the bioelectrical noise generated by the plant.

Living plant controls a machete through an industrial robot arm pic.twitter.com/jQYzMzoG0W

— Canneo (@canneo2103145) October 4, 2022

“The system uses an open source microcontroller connected to the plant to read varying resistance signals across the plant’s leaves,” Bowen wrote. “Using custom software, these signals are mapped in real-time to the movements of the joints of the industrial robot holding a machete.”

The rhododendron is essentially acting as a rudimentary brain, Bowen argued. And given the arm’s non-stop hacking and slashing in the video above, that plant is working through some stuff.
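For a sense of how a pipeline like the one Bowen describes might be wired together, here is a minimal, hypothetical sketch in Python (not Bowen's actual software): it assumes a microcontroller streaming one raw resistance reading per line over USB serial, read here with pyserial, and a placeholder send_joint_angles() helper standing in for whatever interface the industrial arm actually exposes. The port name and joint ranges are likewise invented for illustration.

```python
# Hypothetical sketch: map a plant's fluctuating resistance readings to robot
# joint targets. The serial port, joint ranges and send_joint_angles() stub are
# all illustrative assumptions, not details from Bowen's installation.
import serial  # pip install pyserial

PORT = "/dev/ttyACM0"        # assumed port for the microcontroller
JOINT_RANGES = [             # assumed (min, max) angle in degrees per joint
    (-90, 90), (-45, 45), (-120, 120), (-90, 90), (-180, 180), (-30, 30),
]

def send_joint_angles(angles):
    """Placeholder: forward target angles to whatever controls the arm."""
    print("joint targets:", [round(a, 1) for a in angles])

def map_signal_to_joints(reading, lo=0, hi=1023):
    """Scale a raw ADC reading into a target angle for each joint."""
    t = max(0.0, min(1.0, (reading - lo) / (hi - lo)))  # clamp to 0..1
    return [mn + t * (mx - mn) for mn, mx in JOINT_RANGES]

def run():
    with serial.Serial(PORT, 9600, timeout=1) as link:
        while True:
            line = link.readline().decode("ascii", "ignore").strip()
            if not line:
                continue
            try:
                reading = int(line)
            except ValueError:
                continue  # ignore malformed lines
            send_joint_angles(map_signal_to_joints(reading))

if __name__ == "__main__":
    run()
```

Bowen's actual mapping is presumably richer (he reads signals across multiple leaves), but the core idea of normalizing a noisy biosignal and scaling it onto a set of joint limits is the same.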

White House unveils its 'blueprint' for an AI Bill of Rights

Amazon exploiting tech to wring every last ounce of productivity from its workforce, Clearview AI harvesting our facial features from social media and public surveillance footage, school proctoring software invading our children's rooms, Facebook's whole "accused of contributing to genocide" thing — the same machine learning/AI and automation technologies that have brought us the wonders of the modern world have also wrought upon us the horrors of the modern world. And, by golly, the Biden Administration isn't going to stand for it. 

On Tuesday, the White House Office of Science and Technology Policy (OSTP) released its long-awaited Blueprint for an AI Bill of Rights (BoR). The document will "help guide the design, development, and deployment of artificial intelligence (AI) and other automated systems so that they protect the rights of the American public," per a White House press release.

As such, the BoR will advocate for five principles: Safe and Effective Systems, Algorithmic Discrimination Protections, Data Privacy, Notice and Explanation, and Human Alternatives, Consideration, and Fallback. "Simply put, systems should work, they shouldn't discriminate, they shouldn't use data indiscriminately," BoR co-writer Suresh Venkatasubramanian wrote in a tweet thread Tuesday. "They should be visible and easy to understand, and they shouldn't eliminate human interlocutors."

"There were thousands of edits and comments that made the document strong, rich, and detailed," Venkatasubramanian continued. "The AI Bill of Rights reflects, as befits the title, a consensus, broad, and deep American vision of how to govern the automated technologies that impact our lives." 

“Automated technologies are driving remarkable innovations and shaping important decisions that impact people’s rights, opportunities, and access. The Blueprint for an AI Bill of Rights is for everyone who interacts daily with these powerful technologies — and every person whose life has been altered by unaccountable algorithms,” said Office of Science and Technology Policy Deputy Director for Science and Society Dr. Alondra Nelson. “The practices laid out in the Blueprint for an AI Bill of Rights aren’t just aspirational; they are achievable and urgently necessary to build technologies and a society that works for all of us.”

The Administration has spent more than a year developing the BoR to its current state, including extensive public outreach through panel discussions, public listening sessions, and meetings with everyone from workers and activists to CEOs and entrepreneurs. In addition to the blueprint itself, the OSTP has also released a companion work, From Principles to Practice, which details concrete steps that government agencies, NGOs, and public and private companies alike can take to ensure they are operating within the scope and spirit of the document.

"Effectively implementing these processes require the cooperation of and collaboration among industry, civil society, researchers, policymakers, technologists, and the public," the BoR reads. Lol, no, of course there aren't any actual enforcement mechanisms.

Hitting the Books: What the wearables of tomorrow might look like

Apple's Watch Ultra, with its 2,000-nit display and GPS capabilities, is a far cry from its Revolutionary War-era self-winding forebears. What sorts of wondrous body-mounted technologies might we see another hundred years hence? In his new book, The Skeptics' Guide to the Future, Dr. Steven Novella (with assists from his brothers, Bob and Jay Novella) examines the history of wearables and the technologies that enable them to extrapolate where further advances in flexible circuitry, wireless connectivity and thermoelectric power generation might lead.


Excerpted from the book The Skeptics' Guide to the Future: What Yesterday's Science and Science Fiction Tell Us About the World of Tomorrow by Dr. Steven Novella, with Bob Novella and Jay Novella. Copyright © 2022 by SGU Productions, Inc. Reprinted with permission of Grand Central Publishing. All rights reserved. 


Technology that Enables Wearables

As the name implies, wearable technology is simply technology designed to be worn, so it will advance as technology in general advances. For example, as timekeeping technology progressed, so did the wristwatch, leading to the smartwatches of today. There are certain advances that lend themselves particularly to wearable technology. One such development is miniaturization.

The ability to make technology smaller is a general trend that benefits wearables by extending the number of technologies that are small enough to be conveniently and comfortably worn. We are all familiar by now with the incredible miniaturization in the electronics industry, and especially in computer chip technology. Postage-stamp-sized chips are now more powerful than computers that would have filled entire rooms in prior decades.

As is evidenced by the high-quality cameras on a typical smartphone, optical technology has already significantly miniaturized. There is ongoing research into tinier optics still, using metamaterials to produce telephoto and zoom lenses without the need for bulky glass.

“Nanotechnology” is now a collective buzzword for machines that are built at the microscopic scale (although technically it is much smaller still), and of course, nanotech will have incredible implications for wearables.

We are also at the dawn of flexible electronics, also called "flex circuits" and more collectively "flex tech." This involves printing circuits onto a flexible plastic substrate, allowing for softer technology that moves as we move. Flexible technology can more easily be incorporated into clothing, even woven into its fabric. Two-dimensional materials, like carbon nanotubes, which can form the basis of electronics and circuits, are also highly flexible. Organic circuits are yet another technology that allows for the circuits to be made of flexible material, rather than just printed on flexible material.

Circuits can also be printed directly onto the skin, as a tattoo, using conductive inks that can act as sensors. One company, Tech Tats, already offers one such tattoo for medical monitoring purposes. The ink is printed in the upper layers of the skin, so the tattoos are not permanent. They can monitor things like heart rate and communicate this information wirelessly to a smartphone.

Wearable electronics have to be powered. Small watch batteries already exist, but they have finite energy. Luckily there are a host of technologies being developed that can harvest small amounts of energy from the environment to power wearables (in addition to implantable devices and other small electronics). Perhaps the earliest example of this was the self-winding watch, the first evidence of which comes from 1776. Swiss watchmaker Abraham-Louis Perrelet developed a pocket watch with a pendulum that would wind the watch from the movement of normal walking. Reportedly it took about fifteen minutes of walking to be fully wound.

Mechanical motion isn't the only source of harvestable power, though. Four types of ambient energy exist in the environment: mechanical, thermal, radiant (e.g., sunlight), and chemical. Piezoelectric technology, for example, converts applied mechanical strain into electrical current. The mechanical force can come from the impact of your foot hitting the ground, or just from moving your limbs or even breathing. Quartz and bone are naturally piezoelectric, but piezoelectric materials such as barium titanate and lead zirconate titanate can also be manufactured. Electrostatic and electromagnetic devices harvest mechanical energy in the form of vibrations.

There are thermoelectric generators that can produce electricity from differences in temperature. Because humans are warm-blooded mammals, a significant amount of electricity can be harvested from the waste heat we constantly shed. There are also thermoelectric generators made from flexible material, combining flex tech with energy harvesting. This technology is mostly in the prototype phase right now. In 2021, for example, engineers published the development of a flexible thermoelectric generator made from an aerogel-silicone composite with embedded liquid-metal conductors, resulting in a flexible generator that could be worn on the wrist and generate enough electricity to power a small device.

Ambient radiant energy in the form of sunlight can be converted to electricity through the photovoltaic effect. This is the basis of solar panels, but small and flexible solar panels can be incorporated into wearable devices as well.

All of these energy-harvesting technologies can also double as sensing technology—they can sense heat, light, vibration, or mechanical strain and produce a signal in response. Tiny self-powered sensors can therefore be ubiquitous in our technology.

The Future of Wearable Tech

The technology already exists, or is on the cusp, to have small, flexible, self-powered, and durable electronic devices and sensors, incorporated with wireless technology and advanced miniaturized digital technology. We therefore can convert existing tools and devices into wearable versions, or use them to explore new options for wearable tech. We also can increasingly incorporate digital technology into our clothing, jewelry, and wearable equipment. This means that wearable tech will likely increasingly shift from passive objects to active technology integrated into the rest of our digital lives.

There are some obvious applications here, even though it is difficult to predict what people will find useful versus annoying or simply useless. Smartphones have already become smartwatches, or they can pair together for extended functionality. Google Glass is an early attempt at incorporating computer technology into wearable glasses, and we know how it has been received.

If we extrapolate this technology, one manifestation is that the clothing and gear we already wear can be converted into electronic devices we already use, or they can be enhanced with new functionality that replaces or supports existing devices.

We may, for example, continue to use a smartphone as the hub of our portable electronics. Perhaps that smartphone will be connected not only to wireless earbuds as they are now, but also to a wireless monitor built into glasses, or sensors that monitor health vitals or daily activity. Potentially, the phone could communicate with any device on the planet, so it could automatically contact your doctor’s office regarding any concerning changes, or contact emergency services if appropriate.

Portable cameras could also monitor and record the environment, not just for documenting purposes but also to direct people to desired locations or services, or contact the police if a crime or disaster is in progress.

As our appliances increasingly become part of the “internet of things,” we too will become part of that internet through what we wear, or what’s printed on or implanted beneath our skin. We might, in a very real sense, become part of our home, office, workplace, or car, as one integrated technological whole.

We’ve mostly been considering day-to-day life, but there will also be wearable tech for special occupations and situations. An extreme version of this is exosuits for industrial or military applications. Think Iron Man, although that level of tech is currently fantasy. There is no portable power source that can match Iron Man’s arc reactor, and there doesn’t appear to be any place to store the massive amounts of propellant necessary to fly as he does.

More realistic versions of industrial exosuits are already a reality and will only get better. A better sci-fi analogy might be the loader exo-suit worn by Ripley in Aliens. Powered metal exosuits for construction workers have been in development for decades. The earliest example is the Hardiman, developed by General Electric between 1965 and 1971. That project essentially failed and the Hardiman was never used, but since then development has continued. Applications have mostly been medical, such as helping people with paralysis walk. Industrial uses are still minimal and do not yet include whole-body suits. However, such suits can theoretically greatly enhance the strength of workers, allowing them to carry heavy loads. They could also incorporate tools they would normally use, such as rivet guns and welders.

Military applications for powered exosuits would likely include armor, visual aids such as infrared or night-vision goggles, weapons and targeting systems, and communications. Such exosuits could turn a single soldier into not just enhanced infantry, but also a tank, artillery, communications, medic, and mule for supplies.

Military development might also push technology for built-in emergency medical protocols. A suit could automatically apply pressure to a wound to reduce bleeding. There are already pressure pants that prevent shock by helping to maintain blood pressure. More ambitious tech could automatically inject drugs to counteract chemical warfare, increase blood pressure, reduce pain, or prevent infection. These could be controlled by either onboard AI or remotely by a battlefield medic who is monitoring the soldiers under their watch and taking actions remotely through their suits.

Once this kind of technology matures, it can then trickle down to civilian applications. Someone with life-threatening allergies could carry epinephrine on them to be injected, or they could wear an autoinjector that will dose them as necessary, or be remotely triggered by an emergency medical responder.

Everything discussed so far is an extrapolation from existing technology, and these more mature applications are feasible within fifty years or so. What about the far future? This is likely where nanotechnology comes in. Imagine wearing a nanosuit that fits like a second skin but that is made from programmable and reconfigurable material. It can form any mundane physical object you might need, on command. Essentially, the suit would be every tool ever made.

You could also change your fashion on demand. Go from casual in the morning to business casual for a meeting and then formal for a dinner party without ever changing your clothes. Beyond mere fashion, this could be programmable cosplay—do you want to be a pirate, or a werewolf? More practically, such a nanoskin could be well ventilated when it’s warm and then puff out for good insulation when it’s cold. In fact, it could automatically adjust your skin temperature for maximal comfort.

Such material can be soft and comfortable, but bunch up and become hard when it encounters force, essentially functioning as highly effective armor. If you are injured, it could stem bleeding, maintain pressure, even do chest compressions if necessary. In fact, once such a second skin becomes widely adopted, life without it may quickly become unimaginable and scary.

Wearable technology may become the ultimate in small or portable technology because of the convenience and effectiveness of being able to carry it around with us. As shown, many of the technologies we are discussing might converge on wearable technology, which is a good reminder that when we try to imagine the future, we cannot simply extrapolate one technology but must consider how all technology will interact. We may be making our wearables out of 2D materials, powered by AI and robotic technology, with a brain-machine interface that we use for virtual reality. We may also be creating customized wearables with additive manufacturing, using our home 3D printer.

Tesla debuts an actual, mechanical prototype of its Optimus robot

It seems like just yesterday that Elon Musk ushered a gig worker in a spandex suit onto the Tesla AI Day 2021 stage and told us it was a robot — or at least probably would be one eventually. In the intervening 13 months, the company has apparently been hard at work, replacing the squishy bits of what the crowd saw on stage with proper electronics and mechanisms. At this year's AI Day on Friday, Tesla unveiled the next iteration of its Optimus robotics platform and, well, at least there isn't still a person on the inside?


Tesla CEO Elon Musk debuted the "first" Optimus (again, a skinny guy in a leotard, not an actual machine) in August of last year and, true to his nature, proceeded to set out a series of increasingly incredible claims about the platform's future capabilities — just like how the Cybertruck will have unbreakable windows. As Musk explained at the time, Optimus will run an AI similar to the company's Autopilot system (the one that keeps chasing stationary ambulances) and be capable of working safely around humans without extensive prior training. 

Additionally, Musk assured the assembled crowd, the Tesla Bot would understand complex verbal commands, have "human-level hands," and be able to move at 5 MPH and carry up to 45 pounds despite standing under 6 feet tall and weighing 125 pounds. And, most incredibly, Tesla would have a working prototype for all of that by 2022, which brings us to today.

Kicking off the event, Musk was quickly joined on stage by an early development prototype of the robot — the very first time one of the test units had walked without an umbilical tether. Lacking any exterior panelling, which left its Tesla-designed actuators exposed, the robot moved at a halting and ponderous pace, not unlike early versions of Honda's ASIMO and certainly a far cry from the deft acrobatics that Boston Dynamics' Atlas exhibits. Musk estimates that the robot could cost under $20,000 when built at volume.

Developing...

AI is already better at lip reading than we are

They Shall Not Grow Old, a 2018 documentary about the lives and aspirations of British and New Zealand soldiers living through World War I from acclaimed Lord of the Rings director Peter Jackson, had its hundred-plus-year-old silent footage modernized through both colorization and the recording of new audio for previously non-existent dialog. To get an idea of what the folks featured in the archival footage were saying, Jackson hired a team of forensic lip readers to guesstimate their recorded utterances. Reportedly, “the lip readers were so precise they were even able to determine the dialect and accent of the people speaking.”

“These blokes did not live in a black and white, silent world, and this film is not about the war; it’s about the soldier’s experience fighting the war,” Jackson told the Daily Sentinel in 2018. “I wanted the audience to see, as close as possible, what the soldiers saw, and how they saw it, and heard it.”

That is quite the linguistic feat given that a 2009 study found that most people can only read lips with around 20 percent accuracy and the CDC’s Hearing Loss in Children Parent’s Guide estimates that, “a good speech reader might be able to see only 4 to 5 words in a 12-word sentence.” Similarly, a 2011 study out of the University of Oklahoma saw only around 10 percent accuracy in its test subjects.

“Any individual who achieved a CUNY lip-reading score of 30 percent correct is considered an outlier, giving them a T-score of nearly 80, three times the standard deviation from the mean. A lip-reading recognition accuracy score of 45 percent correct places an individual 5 standard deviations above the mean,” the 2011 study concluded. “These results quantify the inherent difficulty in visual-only sentence recognition.”
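For context on how those scores translate (assuming the CUNY test's T-scores follow the usual convention of a mean of 50 and a standard deviation of 10 points):

\[
z = \frac{T - 50}{10}, \qquad T \approx 80 \;\Rightarrow\; z = \frac{80 - 50}{10} = 3
\]

In other words, a reader scoring near 80 sits roughly three standard deviations above average, and one reaching 45 percent accuracy, at five standard deviations, is rarer still.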

For humans, lip reading is a lot like batting in the Major Leagues — consistently get it right even just three times out of ten and you’ll be among the best to ever play the game. For modern machine learning systems, lip reading is more like playing Go — just round after round of beating up on the meatsacks that created and enslaved you — with today’s state-of-the-art systems achieving well over 95 percent sentence-level word accuracy. And as they continue to improve, we could soon see a day where tasks from silent-movie processing and silent dictation in public to biometric identification are handled by AI systems.

Context Matters

Now, one would think that humans would be better at lip reading by now, given that we’ve been formally practicing the technique since the days of the Spanish Benedictine monk Pedro Ponce de León, who is credited with pioneering the idea in the 16th century.


“We usually think of speech as what we hear, but the audible part of speech is only part of it,” Dr. Fabian Campbell-West, CTO of lip-reading app developer Liopa, told Engadget via email. “As we perceive it, a person's speech can be divided into visual and auditory units. The visual units, called visemes, are seen as lip movements. The audible units, called phonemes, are heard as sound waves.”

“When we're communicating with each other, face-to-face is often preferred because we are sensitive to both visual and auditory information,” he continued. “However, there are approximately three times as many phonemes as visemes. In other words, lip movements alone do not contain as much information as the audible part of speech.”

“Most lipreading actuations, besides the lips and sometimes tongue and teeth, are latent and difficult to disambiguate without context,” then-Oxford University researcher and LipNet developer Yannis Assael noted in 2016, citing Fisher’s earlier studies. These homophemes are the secret to Bad Lip Reading’s success.

What’s wild is that Bad Lip Reading will generally work in any spoken language, whether it’s non-tonal like English or tonal like Vietnamese. “Language does make a difference, especially those with unique sounds that aren't common in other languages,” Campbell-West said. “Each language has syntax and pronunciation rules that will affect how it is interpreted. Broadly speaking, the methods for understanding are the same.”

“Tonal languages are interesting because they use the same word with different tone (like musical pitch) changes to convey meaning,” he continued. “Intuitively this would present a challenge for lip reading, however research shows that it's still possible to interpret speech this way. Part of the reason is that changing tone requires physiological changes that can manifest visually. Lip reading is also done over time, so the context of previous visemes, words and phrases can help with understanding.”

“It matters in terms of how good your knowledge of the language is because you're basically limiting the set of ambiguities that you can search for,” Adrian KC Lee, ScD, professor and chair of the Department of Speech and Hearing Sciences at the University of Washington, told Engadget. “Say, ‘cold’ and ‘hold,’ right? If you just sit in front of a mirror, you can't really tell the difference. So from a physical point of view, it's impossible, but if I'm holding something versus talking about the weather, you, by the context, already know.”

In addition to the general context of the larger conversation, much of what people convey when they speak comes across non-verbally. “Communication is usually easier when you can see the person as well as hear them,” Campbell-West said, “but the recent proliferation of video calls has shown us all that it's not just about seeing the person; there's a lot more nuance. There is a lot more potential for building intelligent automated systems for understanding human communication than what is currently possible.”

Missing a Forest for the Trees, Linguistically

While human and machine lip readers have the same general end goal, the aims of their individual processes differ greatly. As a team of researchers from Iran University of Science and Technology argued in 2021, “Over the past years, several methods have been proposed for a person to lip-read, but there is an important difference between these methods and the lip-reading methods suggested in AI. The purpose of the proposed methods for lip-reading by the machine is to convert visual information into words… However, the main purpose of lip-reading by humans is to understand the meaning of speech and not to understand every single word of speech.”

In short, “humans are generally lazy and rely on context because we have a lot of prior knowledge,” Lee explained. And it’s that dissonance in process — the linguistic equivalent of missing a forest for the trees — that presents such a unique challenge to the goal of automating lip reading.

“A major obstacle in the study of lipreading is the lack of a standard and practical database,” wrote Mingfeng Hao of Xinjiang University in 2020’s A Survey on Lip Reading Technology. “The size and quality of the database determine the training effect of this model, and a perfect database will also promote the discovery and solution of more and more complex and difficult problems in lipreading tasks.” Other obstacles can include environmental factors like poor lighting and shifting backgrounds, which can confound machine vision systems, as can variances due to the speaker’s skin tone, the rotational angle of their head (which shifts the viewed angle of the mouth) and the obscuring presence of wrinkles and beards.

As Assael notes, “Machine lipreading is difficult because it requires extracting spatiotemporal features from the video (since both position and motion are important).” However, as Hao explains, “action recognition, which belongs to video classification, can be classified through a single image,” while lipreading “often needs to extract the features related to the speech content from a single image and analyze the time relationship between the whole sequence of images to infer the content.” It’s an obstacle that requires both natural language processing and machine vision capabilities to overcome.
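To make that concrete, here is a minimal, hypothetical sketch in PyTorch of the kind of architecture the field has converged on (not LipNet's or any production system's actual code): a 3D convolutional front-end extracts spatiotemporal features from a stack of mouth-region frames, and a recurrent back-end models how those features evolve over time before predicting per-frame character probabilities. The layer sizes, vocabulary and toy input are all illustrative assumptions.

```python
# Illustrative sketch of a spatiotemporal lip-reading model: 3D convolutions
# capture lip position and motion, a GRU models the sequence over time.
import torch
import torch.nn as nn

class TinyLipReader(nn.Module):
    def __init__(self, vocab_size=28):  # e.g., 26 letters + space + blank
        super().__init__()
        # Front-end: convolve jointly over time, height and width.
        self.frontend = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=(3, 5, 5), stride=(1, 2, 2), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
            nn.Conv3d(32, 64, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 4, 4)),  # keep the time axis, pool space
        )
        # Back-end: a bidirectional GRU reads the per-frame feature vectors in order.
        self.gru = nn.GRU(input_size=64 * 4 * 4, hidden_size=128,
                          batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * 128, vocab_size)

    def forward(self, frames):
        # frames: (batch, 1, time, height, width) grayscale mouth crops
        feats = self.frontend(frames)                    # (B, 64, T, 4, 4)
        b, c, t, h, w = feats.shape
        feats = feats.permute(0, 2, 1, 3, 4).reshape(b, t, c * h * w)
        out, _ = self.gru(feats)                         # (B, T, 256)
        return self.classifier(out)                      # per-frame character logits

# Toy usage: two clips of 75 frames, each a 64x128 grayscale mouth crop.
model = TinyLipReader()
logits = model(torch.randn(2, 1, 75, 64, 128))           # -> (2, 75, 28)
```

Models in this family are typically trained with a CTC-style loss that aligns those per-frame predictions to a ground-truth transcript without needing frame-by-frame labels.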

Acronym Soup

Today, speech recognition comes in three flavors, depending on the input source. What we’re talking about today falls under Visual Speech Recognition (VSR) research — that is, using only visual means to understand what is being conveyed. Then there’s Automated Speech Recognition (ASR), which relies entirely on audio (think “Hey Siri”), and Audio-Visual Automated Speech Recognition (AV-ASR), which incorporates both audio and visual cues into its guesses.

“Research into automatic speech recognition (ASR) is extremely mature and the current state-of-the-art is unrecognizable compared to what was possible when the research started,” Campbell-West said. “Visual speech recognition (VSR) is still at the relatively early stages of exploitation and systems will continue to mature.” Liopa’s SRAVI app, which enables hospital patients to communicate regardless of whether they can actively verbalize, relies on the latter methodology. “This can use both modes of information to help overcome the deficiencies of the other,” he said. “In future there will absolutely be systems that use additional cues to support understanding.”

“There are several differences between VSR implementations,” Campbell-West continued. “From a technical perspective the architecture of how the models are built is different … Deep-learning problems can be approached from two different angles. The first is looking for the best possible architecture, the second is using a large amount of data to cover as much variation as possible. Both approaches are important and can be combined.”

In the early days of VSR research, datasets like AVLetters had to be hand-labeled and -categorized, a labor-intensive limitation that severely restricted the amount of data available for training machine learning models. As such, initial research focused first on the absolute basics — alphabet and number-level identification — before eventually advancing to word- and phrase-level identification, with sentence-level being today’s state-of-the-art which seeks to understand human speech in more natural settings and situations.

In recent years, the rise of more advanced deep learning techniques, which train models on essentially the internet at large, along with the massive expansion of social and visual media posted online, has enabled researchers to generate far larger datasets, like the Oxford-BBC Lip Reading Sentences 2 (LRS2), which is based on thousands of spoken lines from various BBC programs. LRS3-TED gleaned 150,000 sentences from various TED programs, while the LSVSR (Large-Scale Visual Speech Recognition) database, among the largest currently in existence, offers 140,000 hours of audio segments with 2,934,899 speech statements and over 127,000 words.

And it’s not just English: Similar datasets exist for a number of languages such as HIT-AVDB-II, which is based on a set of Chinese poems, or IV2, a French database composed of 300 people saying the same 15 phrases. Similar sets exist too for Russian, Spanish and Czech-language applications.

Looking Ahead

VSR’s future could wind up looking a lot like ASR’s past, says Campbell-West: “There are many barriers for adoption of VSR, as there were for ASR during its development over the last few decades.” Privacy is a big one, of course. Though the younger generations are less inhibited about documenting their lives online, Campbell-West said, “people are rightly more aware of privacy now than they were before. People may tolerate a microphone while not tolerating a camera.”

Regardless, Campbell-West remains excited about VSR’s potential future applications, such as high-fidelity automated captioning. “I envisage a real-time subtitling system so you can get live subtitles in your glasses when speaking to someone,” Campbell-West said. “For anyone hard-of-hearing this could be a life-changing application, but even for general use in noisy environments this could be useful.”

“There are circumstances where noise makes ASR very difficult but voice control is advantageous, such as in a car,” he continued. “VSR could help these systems become better and safer for the driver and passengers.”

On the other hand, Lee, whose lab at UW has researched Brain-Computer Interface technologies extensively, sees wearable text displays more as a “stopgap” measure until BCI tech further matures. “We don't necessarily want to sell BCI to that point where, ‘Okay, we're gonna do brain-to-brain communication without even talking out loud,’” Lee said. “In a decade or so, you’ll find biological signals being leveraged in hearing aids, for sure. As little as [the device] seeing where your eyes glance may be able to give it a clue on where to focus listening.”

“I hesitate to really say, ‘Oh yeah, we're gonna get brain-controlled hearing aids,’” Lee conceded. “I think it is doable, but you know, it will take time.”

Meta's new Make-a-Video AI can generate quick movie clips from text prompts

Meta unveiled its Make-a-Scene text-to-image generation AI in July, which, like Dall-E and Midjourney, utilizes machine learning algorithms (and massive databases of scraped online artwork) to create fantastical depictions of written prompts. On Thursday, Meta CEO Mark Zuckerberg revealed Make-a-Scene's more animated contemporary, Make-a-Video.

As its name implies, Make-a-Video is, "a new AI system that lets people turn text prompts into brief, high-quality video clips," Zuckerberg wrote in a Meta blog Thursday. Functionally, Video works the same way that Scene does — relying on a mix of natural language processing and generative neural networks to convert non-visual prompts into images — it's just pulling content in a different format.

"Our intuition is simple: learn what the world looks like and how it is described from paired text-image data, and learn how the world moves from unsupervised video footage," a team of Meta researchers wrote in a research paper published Thursday morning. Doing so enabled the team to reduce the amount of time needed to train the Video model and eliminate the need for paired text-video data, while preserving "the vastness (diversity in aesthetic, fantastical depictions, etc.) of today’s image generation models."   

As with most of Meta's AI research, Make-a-Video is being released as an open-source project. "We want to be thoughtful about how we build new generative AI systems like this," Zuckerberg noted. "We are openly sharing this generative AI research and results with the community for their feedback, and will continue to use our responsible AI framework to refine and evolve our approach to this emerging technology."

As with seemingly every generative AI that is released, the opportunity for misuse of Make-a-Video is not a small one. To get ahead of any potential nefarious shenanigans, the research team preemptively scrubbed the Make-a-Video training dataset of any NSFW imagery as well as toxic phrasing.     

Amazon is expanding the Astro's abilities for both home and business

While Amazon is widely known for its Ring brand of doorbell camera home security systems, the company last year introduced a more mobile, and way more adorable, monitoring platform: Astro. The $1,500 automaton essentially serves as an Alexa on wheels, trundling about your home like an AIBO that also manages your calendar and doubles as a guard dog. On Wednesday, Amazon unveiled a new iteration of Astro, one that can now detect the presence of your real cat or dog. 

The new feature will trigger while the Astro is "on patrol" around your home. When it encounters your pet, Astro will capture a short video clip of them and share it with you via Live View (part of the Alexa Together system). 

"You can use Live View to tell your dog to get off the couch, or you can take a picture of what they’re doing to add to your pet scrapbook," Ken Washington, vice president of Consumer Robotics, said during the event. "We think this feature will be especially useful by providing a live connection to your pets so that you have peace of mind about them, no matter where you are."

Astro is also gaining some added situational awareness. The robot can already map out its patrol routes through your home but, with a new multimodal AI capability, Astro will actively pay attention to "things in your home that you want it to learn about—and better notify you if something isn’t right," Washington said.  

Developing...

BMW's next in-vehicle voice assistant will be built from Amazon Alexa

BMW began incorporating smart voice features into its infotainment systems using Amazon's Alexa in 2018. In the intervening years, the number of models sporting the digital assistant has only increased. At Amazon's 2022 Devices & Services event on Wednesday, the two companies announced a deepening of their partnership: BMW's next generation of infotainment systems will feature an Alexa-based assistant developed specifically with the driver in mind.

The as-yet-unnamed BMW assistant will be built using Alexa Custom Assistant, "a comprehensive solution that makes it easy for BMW and other brands and device makers to create their own custom intelligent assistant tailored to their brand personality and customer needs." Those capabilities might include a proactive notification from the vehicle's assistant alerting the driver that the battery charge is low while automatically reserving a charging slot at the next off-ramp, or preemptively scheduling regular service with the local dealership. Such features "will enable an even more natural dialogue between driver and vehicle," per a Wednesday BMW press release.

Amazon's redesigned Echo Auto will better integrate with your vehicle

Building on its success in convincing the public to outfit their homes and offices with various Alexa-enabled Echo devices, Amazon introduced the very first Echo Auto in 2018. More than a million pre-orders and four years later, the Echo Auto is getting an upgrade, Amazon announced Wednesday at its 2022 Devices & Services event.

The new unit will be slimmer than its predecessor and will include a mounting plate that adheres more securely than the last version — so make sure you really like where it's positioned before taking off the backing film. The unit still leverages five separate mics to pick up commands over road noise, so you'll have a good amount of flexibility in where you place it. Once installed, it does what every Alexa does: respond to voice commands. It handles the standard fare of playing music — including a "follow me" function that allows you to switch audio from your home stereo to the vehicle as you get in — as well as navigation and hands-free calls. 

“Ambient technology is at its best in environments where people are focused on other tasks, and nowhere is that more important than in the car,” Heather Zorn, Amazon’s vice president for Alexa said during the event. “Voice can minimize distractions and help you keep your eyes on the road so you can focus on the fun of driving.”

What's more, with help from Amazon's cloud, the $55 Echo Auto will also be able to alert the driver when their pre-ordered Whole Foods grocery order is ready for pickup, and can even summon a tow truck if you run out of gas. Simply say, “Alexa, call Roadside Assistance.”