Researchers develop an automated method for making communications more polite
Jun 30, 2020
Pittsburgh [USA], June 30: At a tense time when a pandemic rages, politicians wrangle for votes and protesters demand racial justice, a little politeness and courtesy go a long way. Now, researchers have developed an automated method for making communications more polite.
Specifically, the method takes nonpolite directives or requests -- those that use either impolite or neutral language -- and restructures them or adds words to make them more well-mannered. "Send me the data," for instance, might become "Could you please send me the data?"
The researchers, from Carnegie Mellon University, will present their study on politeness transfer at the Association for Computational Linguistics annual meeting, which will be held virtually beginning July 5.
The idea of transferring a style or sentiment from one communication to another -- turning negative statements positive, for instance -- is something language technologists have been doing for some time. Shrimai Prabhumoye, a PhD student in CMU's Language Technologies Institute (LTI), said performing politeness transfer has long been a goal.
"It is extremely relevant for some applications, such as if you want to make your emails or chatbot sound more polite or if you're writing a blog," she said. "But we could never find the right data to perform this task."
She and LTI master's students Aman Madaan, Amrith Setlur and Tanmay Parekh solved that problem by generating a dataset of 1.39 million sentences labelled for politeness, which they used for their experiments.
The source of these sentences might seem surprising. They were derived from emails exchanged by employees of Enron, a Texas-based energy company that, until its demise in 2001, was better known for corporate fraud and corruption than for social niceties. But half a million corporate emails became public as a result of lawsuits surrounding Enron's fraud scandal and subsequently have been used as a dataset for a variety of research projects.
But even with a dataset in hand, the researchers found it challenging simply to define politeness.
"It's not just about using words such as 'please' and 'thank you,'" Prabhumoye said. Sometimes, it means making language a bit less direct, so that instead of saying "you should do X," the sentence becomes something like "let us do X."
And politeness varies from one culture to the next. It's common for native speakers of North American English to use "please" in requests to close friends, but in Arab culture, it would be considered awkward, if not rude. For their study, the CMU researchers restricted their work to speakers of North American English in a formal setting.
The politeness dataset was analysed to determine the frequency and distribution of words in the polite and nonpolite sentences. Then the team developed a "tag and generate" pipeline to perform politeness transfers. First, impolite or neutral words or phrases are tagged; then a text generator replaces each tagged item. The system takes care not to change the meaning of the sentence.
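To make the two-step idea concrete, here is a toy Python sketch of tagging followed by generation. It is an illustration under stated assumptions, not the researchers' code: the six example sentences, the frequency-ratio threshold, the `[TAG]` placeholder and the hand-written `generate` function are all invented here, and the real system uses a trained text generator rather than fixed phrases.

```python
from collections import Counter

# Toy corpora invented for illustration; NOT taken from the released dataset.
POLITE = [
    "could you please send me the data",
    "please let us know the schedule",
    "could you please review the file",
]
NONPOLITE = [
    "send me the data",
    "tell me the schedule now",
    "review the file now",
]

TAG = "[TAG]"  # hypothetical placeholder token


def word_counts(sentences):
    """Count how often each word appears across a list of sentences."""
    return Counter(w for s in sentences for w in s.split())


def style_markers(target, other, threshold=2.0, smoothing=1.0):
    """Words whose smoothed relative frequency in `target` is at least
    `threshold` times their relative frequency in `other`."""
    t, o = word_counts(target), word_counts(other)
    vocab = set(t) | set(o)
    t_total = sum(t.values()) + smoothing * len(vocab)
    o_total = sum(o.values()) + smoothing * len(vocab)
    markers = set()
    for w in vocab:
        ratio = ((t[w] + smoothing) / t_total) / ((o[w] + smoothing) / o_total)
        if ratio >= threshold:
            markers.add(w)
    return markers


def tag(sentence, markers):
    """Step 1: replace marker words with TAG. If nothing is replaced,
    leave a slot at the start so a phrase can be added there."""
    words = [TAG if w in markers else w for w in sentence.lower().split()]
    if TAG not in words:
        words.insert(0, TAG)
    return words


def generate(tagged_words):
    """Step 2: fill each slot. A fixed-phrase stand-in (assumption) for
    the trained generator described in the article."""
    out = []
    for i, w in enumerate(tagged_words):
        if w == TAG:
            out.append("could you please" if i == 0 else "please")
        else:
            out.append(w)
    return " ".join(out)


markers = style_markers(NONPOLITE, POLITE)
print(generate(tag("send me the data", markers)))
```

On this toy data the word "now" is flagged as characteristic of the nonpolite sentences, while common words such as "the" are not, so the untagged words pass through unchanged and the meaning of the request is preserved.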
"It's not just about cleaning up swear words," Prabhumoye said of the process. Initially, the system had a tendency to simply add words to sentences, such as "please" or "sorry." If "Please help me" was considered polite, the system considered "Please please please help me" even more polite.
But over time the scoring system became more realistic and the changes became subtler. First-person singular pronouns, such as I, me and mine, were replaced by first-person plural pronouns, such as we, us and our. And rather than position "please" at the beginning of the sentence, the system learned to insert it within the sentence: "Could you please send me the file?"
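The two kinds of edits described here can be mimicked with a few hand-written rules. The Python sketch below is only a rough imitation: the system in the study learned these changes from data, whereas the pronoun table, the regular expressions and both function names are assumptions made for illustration.

```python
import re

# Hypothetical lookup table: first-person singular -> first-person plural.
PLURAL = {"i": "we", "me": "us", "my": "our", "mine": "ours"}


def pluralize_first_person(sentence):
    """Swap first-person singular pronouns for plural ones.
    Contractions such as "I'm" are skipped by the (?!') lookahead."""
    def swap(match):
        word = match.group(0)
        repl = PLURAL[word.lower()]
        # Keep the capital letter only for a sentence-initial pronoun.
        if match.start() == 0 and word[0].isupper():
            return repl.capitalize()
        return repl
    return re.sub(r"\b(i|me|my|mine)\b(?!')", swap, sentence,
                  flags=re.IGNORECASE)


def insert_please(sentence):
    """Insert "please" after the first "could/can/would/will you",
    unless "please" already follows it."""
    return re.sub(r"\b(could|can|would|will) you\b(?! please)",
                  r"\1 you please", sentence, count=1,
                  flags=re.IGNORECASE)


print(pluralize_first_person("I need my report"))
print(insert_please("Could you send me the file?"))
```

The two functions are kept separate on purpose: the article's own example, "Could you please send me the file?", keeps the singular "me", which suggests the learned system did not apply the pronoun change to every sentence.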
Prabhumoye said the researchers have released their labelled dataset for use by other researchers, hoping to encourage them to further study politeness.