Statement about Microsoft's new "Responsible AI Standard"

We’ve had a lot of requests for comment on Microsoft's new "Responsible AI Standard", especially as it was framed in the NYT article that followed its announcement. Why? Because Microsoft stated they are retiring capabilities that infer emotional states and identity attributes such as gender, age, smile, facial hair, hair, and makeup.

Our reaction, tl;dr: Azure may be out, but it’s not a slam-dunk.

A range of answers to press questions, distilled:

1. What are your thoughts on Microsoft's new "Responsible AI Standard" as it applies to emotion profiling?

On facial recognition and emotional AI, it is important to read carefully what Microsoft actually said.

First, their statement refers to the ‘provision of open-ended API access to technology that can scan people’s faces and purport to infer their emotional states based on their facial expressions or movements.’

Second, the statement about retiring inference of emotional states applies only to their Azure Face services.

Third, the stated ‘need to carefully analyse all AI systems that purport to infer people’s emotional states’ is a much weaker statement of intent than “we have stopped all emotional AI development.”

I am no Microsoft insider, but the sceptic in me wonders if the Azure retirement was an easy win. Using computer vision to gauge expressions with a view to inferring the internal experience of emotion makes for a faulty product. Why would it be continued when it does not work and causes reputational damage?

Microsoft is right to say that these technologies are debatable. The nub of the critique focusses on three assertions: (1) that emotion is biologically based, (2) that facial expressions are universal across cultures, and (3), most importantly, that there is a causal link between facial expression and interior psychological phenomena.

Where things get more complicated is when expressions are understood in the context of other biological states (such as heart rate), where a person is, and who they are with. This is not an assurance of accuracy (there isn’t one), but it is the direction of travel for the overall emotional AI industry, i.e., to understand expressions in context. This points to a fundamental problem with how these technologies are critiqued. Accuracy, bias and racism, as noted in the press releases, are of course important, but other human rights are at stake, not least the right to mental integrity.

How does this apply to Microsoft? Well, for one, Microsoft has long had an interest in technology for the workplace. Their ambition is not simply office software and communication, but management and experience of the workplace. This includes scoring emotion expressions, physiological states, behaviour, participant histories, and connections with others, but also sensing of environmental factors such as temperature, air quality, noise, seating comfort, and room size in relation to the number of participants (see the 2020 patent ‘Meeting Insight Computing System’).

Emotion in this setting would potentially be inferred through expressions, but also triangulated with assessments of body language, tone of voice, and explicit statements. In my view, even if this came to pass, it too would fail, as even the most positive work meetings are informed by contexts of hierarchy, power, existing relationships, institutional histories and, perhaps most importantly, what is unspoken in meetings. These insights are hard to define, never mind measure.

If this were just one patent it might be written off as an experimental interest, but there is a broader interest in using emotion: for example, analysing the facial responses (including emotion expressions) and head gestures (such as nodding in agreement or shaking in disagreement) of audiences in online video calls, spotlighting for the speaker those audience members who are expressive (AffectiveSpotlight), and Metaverse-based interests (Mesh for Teams).

I would be very interested to know whether Microsoft will pull all forms of emotion and related psycho-physiological sensing from their entire suite of operations, i.e., work, automobility, communication, education, and mixed reality development. This would be a slam-dunk.

All of this said, the commitment to ‘actionable guidance for our teams that goes beyond the high-level principles that have dominated the AI landscape to date’ is a good one. A problem that has dominated corporate AI ethics is that values such as fairness, accountability and transparency sound good in public relations and stakeholder presentations, but mean very little either to developers tasked with building products or to those seeking to hold corporations to account.

2. Do Microsoft's efforts go far enough?

What I would have liked to have seen is a company-wide halt of any system that processes data about emotion or related psycho-physiological states. The only potential caveat is where expressions might be rendered from cameras onto an in-world avatar. The key difference is that there would be no labelling of expressions (e.g. angry, happy, etc.) and nothing recorded, so it would be for in-world display purposes only.

3. Why has Microsoft published this now?

The NYT article on the topic notes the European Union's development of its ‘AI Act’. Early drafts of this show a keen interest in emotion (in Article 1 no less), with usage in key sectors such as work and education being labelled as high risk. I’m not an expert on US governance and AI law, but I’m mindful too of the FTC, which, since the Biden administration took office, has been chaired by Lina Khan and features Meredith Whittaker, who has long been critical of emotional AI, as a senior advisor.

On internal pressure, it’s hard to say. Certainly there are figures working on emotion-based products who have concerns about misuse. The problem is unanticipated consequences, and I very much doubt that those working on computer vision and emotion thought that these technologies would be used at national borders to gauge micro-expressions and sincerity.

Certainly, too, Microsoft is aware of the writings and arguments of those who have been leading the charge against face-based emotional AI (including Meredith Whittaker). This has translated into a social media groundswell of reaction against these technologies.

4. What should others be doing?

I’ve been looking at this sector for years now and the problem is still the same: use of these technologies will lead to reputational damage. This applies not only to companies developing these technologies, but to those who buy and use them. Just don’t, unless you have a radically pro-social idea of how it could work (I’ve developed non-face-based ideas on this for a book to be titled Automated Empathy, out later this year with Oxford University Press). It’s worth adding, too, that at the Emotional AI Lab we routinely poll the UK public on attitudes to emotional AI in a range of contexts (work, education, toys, cars, mixed reality, security, health, and more). The key signal we see from polling is that young adults have misgivings about it and older people really dislike it.

5. How do you see Microsoft's Responsible AI developing? 

Interesting question. I see that they are promoting this as a work-in-progress and a live document, so they are undoubtedly watching reaction closely. On the emotion part of this, if the overall reaction by critics is “Great, we won” then little will change. My view is that the question of facial expressions is a smaller part of an industry that wants to know and interact with people in increasingly intimate and invasive ways. This is the time to pressure Microsoft for further change and, given recent changes in how international human rights bodies are approaching these technologies, and in key regional law, there is a chance of further development.

[Image: Microsoft’s Responsible AI Standard graphic]

Andrew McStay