
Ransomware is already out of control. AI-powered ransomware could be ‘terrifying.’

Hiring AI experts to automate ransomware could be the next step for well-endowed ransomware groups that are seeking to scale up their attacks.
 

In the perpetual battle between cybercriminals and defenders, the latter have always had one largely unchallenged advantage: The use of AI and machine learning allows them to automate a lot of what they do, especially around detecting and responding to attacks. This leg-up hasn't been nearly enough to keep ransomware at bay, but it has still been far more than what cybercriminals have ever been able to muster in terms of AI and automation.

That’s because deploying AI-powered ransomware would require AI expertise. And the ransomware gangs don’t have it. At least not yet.

But given the wealth accumulated by a number of ransomware gangs in recent years, it may not be long before attackers do bring aboard AI experts of their own, prominent cybersecurity authority Mikko Hyppönen said.

Some of these groups have so much cash — or bitcoin, rather — that they could now potentially compete with legit security firms for talent in AI and machine learning, according to Hyppönen, the chief research officer at cybersecurity firm WithSecure.

Ransomware gang Conti pulled in $182 million in ransom payments during 2021, according to blockchain data platform Chainalysis. Leaks of Conti's chats suggest that the group may have invested some of its take in pricey "zero day" vulnerabilities and the hiring of penetration testers.

"We have already seen [ransomware groups] hire pen testers to break into networks to figure out how to deploy ransomware. The next step will be that they will start hiring ML and AI experts to automate their malware campaigns," Hyppönen told Protocol.

"It's not a far reach to see that they will have the capability to offer double or triple salaries to AI/ML experts in exchange for them to go to the dark side," he said. "I do think it's going to happen in the near future — if I would have to guess, in the next 12 to 24 months."

If this happens, Hyppönen said, "it would be one of the biggest challenges we're likely to face in the near future."

AI for scaling up ransomware

While doom-and-gloom cybersecurity predictions are abundant, Hyppönen, with three decades of experience on matters of cybercrime, is not just any prognosticator. He has been with his current company, which until recently was known as F-Secure, since 1991 and has been researching — and vying with — cybercriminals since the field's early days.

In his view, the introduction of AI and machine learning to the attacker side would be a distinct change of the game. He's not alone in thinking so.

When it comes to ransomware, for instance, automating large portions of the process could mean an even greater acceleration in attacks, said Mark Driver, a research vice president at Gartner.

Currently, ransomware attacks are often highly tailored to the individual target, making them more difficult to scale, Driver said. Even so, the number of ransomware attacks doubled year-over-year in 2021, SonicWall has reported — and ransomware has been getting more successful as well: the percentage of affected organizations that agreed to pay a ransom shot up to 58% in 2021, from 34% the year before, according to Proofpoint.

However, if attackers were able to automate ransomware using AI and machine learning, that would allow them to go after an even wider range of targets, according to Driver. That could include smaller organizations, or even individuals.

"It's not worth their effort if it takes them hours and hours to do it manually. But if they can automate it, absolutely," Driver said. Ultimately, “it's terrifying.”

The prediction that AI is coming to cybercrime in a big way is not brand new, but it has yet to manifest, Hyppönen said. Most likely, that's because cybercriminals' inability to compete with deep-pocketed enterprise tech vendors for the necessary talent has always been a constraint in the past.

The huge success of the ransomware gangs in 2021, predominantly Russia-affiliated groups, would appear to have changed that, according to Hyppönen. Chainalysis reports it tracked ransomware payments totaling $602 million in 2021, led by Conti's $182 million. The ransomware group that struck the Colonial Pipeline, DarkSide, earned $82 million last year, and three other groups brought in more than $30 million in that single year, according to Chainalysis.

Hyppönen estimated that less than a dozen ransomware groups might have the capacity to invest in hiring AI talent in the next few years, primarily gangs affiliated with Russia.

‘We would definitely not miss it’

If cybercrime groups hire AI talent with some of their windfall, Hyppönen believes the first thing they'll do is automate the most manually intensive parts of a ransomware campaign. The actual execution of a ransomware attack remains difficult, he said.

"How do you get it on 10,000 computers? How do you find a way inside corporate networks? How do you bypass the different safeguards? How do you keep changing the operation, dynamically, to actually make sure you're successful?" Hyppönen said. “All of that is manual."

Monitoring systems, changing the malware code, recompiling it and registering new domain names to avoid defenses — things it takes humans a long time to do — would all be fairly simple to do with automation. "All of this is done in an instant by machines,” Hyppönen said.

That means it should be very obvious when AI-powered automation comes to ransomware, according to Hyppönen.

"This would be such a big shift, such a big change," he said. "We would definitely not miss it."

But would the ransomware groups really decide to go to all this trouble? Allie Mellen, an analyst at Forrester, said she's not as sure. Given how successful ransomware groups are already, Mellen said it's unclear why they would bother to take this route.

"They're having no problem with the approaches that they're taking right now," she said. "If it ain't broke, don't fix it."

Others see a higher likelihood of AI playing a role in attacks such as ransomware. Like defenders, ransomware gangs clearly have a penchant for evolving their techniques to try to stay ahead of the other side, said Ed Bowen, managing director for the AI Center of Excellence at Deloitte.

"I'm expecting it — I expect them to be using AI to improve their ability to get at this infrastructure," Bowen said. "I think that's inevitable."

Lower barrier to entry

While AI talent is in extremely short supply right now, that will start to change in coming years as a wave of people graduate from university and research programs in the field, Bowen noted.

The barriers to entry in the AI field are also getting lower as tools become more accessible to users, Hyppönen said.

"Today, all security companies rely heavily on machine learning — so we know exactly how hard it is to hire experts in this field. Especially people who have expertise both in cybersecurity and in machine learning. So these are hard people to recruit," he told Protocol. "However, it's becoming easier to become an expert, especially if you don't need to be a world-class expert."

That dynamic could increase the pool of candidates for cybercrime organizations who are, simultaneously, richer and “more powerful than ever before," Hyppönen said.

Should this future come to pass, it will have massive implications for cyber defenders, since the likely result is a greater volume of attacks against a broader range of targets.

Among other things, this would likely mean that the security industry would itself be looking to compete harder than ever for AI talent, if only to try to stay ahead of automated ransomware and other AI-powered threats.

Between attackers and defenders, "you're always leapfrogging each other" on technical capabilities, Driver said. "It's a war of trying to get ahead of the other side."


Top 5 Real-World Applications for Natural Language Processing

Emerging technologies have greatly facilitated our daily lives. For instance, when you are making dinner and want to call your mom for the secret recipe, you don't have to stop what you are doing and dial the number. Instead, you can simply say, "Hey Siri, call Mom," and your iPhone automatically makes the call for you.

The application seems simple enough, but the technology behind it is sophisticated. The magic that makes the scenario above possible is natural language processing (NLP). NLP is far more than a pillar for building Siri; it also empowers many other AI-infused applications in the real world.

This article first explains what NLP is and later moves on to introduce five real-world applications of NLP.

What is NLP?

From chatbots to Siri, from virtual support agents to knowledge graphs, the application and usage of NLP are ubiquitous in our daily life. NLP stands for “Natural Language Processing”. Simply put, NLP is the ability of a machine to understand human language. It is the bridge that enables humans to directly interact and communicate with machines. NLP is a subfield of artificial intelligence (AI) and in Bill Gates's words, “NLP is the pearl in the crown of AI.”

With the ever-expanding market size of NLP, countless companies are investing heavily in this industry, and their product lines vary. Many different but specific systems for various tasks and needs can be built by leveraging the power of NLP.

The Five Real World NLP Applications

The most popular and flourishing real-world applications of NLP include conversational user interfaces, AI-powered call quality assessment, intelligent outbound calls, AI-powered call operators, and knowledge graphs, to name a few.

Chatbots in E-commerce

Over five years ago, Amazon already realized the potential benefit of applying NLP to its customer service channels. Back then, when customers had issues with their orders, the only thing they could do was call a customer service agent. Most of the time, however, what they got from the other end of the phone was: "Your call is important to us. Please hold, we're currently experiencing a high call load." Thankfully, Amazon quickly realized the damaging effect this could have on its brand image and started building chatbots.

Nowadays, when you want to quickly get, say, a refund online, there's a much more convenient way: activate the Amazon customer service chatbot, type in your order information, and make a refund request. The chatbot interacts and replies the same way a real human does. Apart from chatbots that handle the post-sales customer experience, chatbots also offer pre-sales consulting: if you have questions about a product you are about to buy, you can simply chat with a bot and get answers.

E-commerce chatbots.

With the emergence of new concepts like the metaverse, NLP can do more than power AI chatbots. Customer support avatars in the metaverse also rely on NLP technology, giving customers a more realistic chat experience.

Customer support avatar in the metaverse.

Conversational User Interface

Another trendy and promising application is the interactive system. Many well-recognized companies are betting big on CUIs (conversational user interfaces). CUI is the general term for a user interface that can simulate a conversation with a real human being.

The most common CUIs in our everyday lives are Apple's Siri, Microsoft's Cortana, Google Assistant, and Amazon's Alexa.

Apple's Siri is a common example of a conversational user interface.

In addition, CUIs can be embedded into cars, especially EVs (electric vehicles). NIO, an automobile manufacturer dedicated to designing and developing EVs, launched its own CUI, NOMI, in 2018. Functionally, in-car CUIs work much the same way as Siri: drivers can focus on steering the car while asking the CUI to adjust the A/C temperature, play a song, lock the windows and doors, navigate to the nearest gas station, and so on.

The conversational user interface in cars.

The Algorithm Behind

Despite all the fancy algorithms the technical media boast about, one of the most fundamental ways to build a chatbot is to construct and organize FAQ pairs (or, more straightforwardly, question-answer pairs) and use NLP algorithms to figure out whether a user query matches any entry in your FAQ knowledge base. A simple FAQ example would look like this:

Q: Can I have some coffee?

A: No, I’d rather have some ribs.

Now that this FAQ pair is stored in your NLP system, a user can simply ask a similar question, for example: "Coffee, please!" If your algorithm is smart enough, it will figure out that "Coffee, please" closely resembles "Can I have some coffee?" and will output the corresponding answer: "No, I'd rather have some ribs." And that's how things are done.

For a very long time, FAQ search algorithms were based solely on inverted indexing. In this approach, you first tokenize the original sentences and put the tokens and documents into a system like Elasticsearch, which uses an inverted index for lookup and algorithms like TF-IDF or BM25 for scoring.
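
To make this concrete, here is a minimal sketch of keyword-based FAQ retrieval, using the open-source rank_bm25 package; the FAQ entries are the toy examples from above, and the naive tokenizer stands in for a real analyzer:

```python
# pip install rank-bm25
import re
from rank_bm25 import BM25Okapi

def tokenize(text):
    # Naive tokenizer; production systems use Elasticsearch analyzers.
    return re.findall(r"\w+", text.lower())

# Toy FAQ knowledge base: stored questions and their answers.
faq = {
    "Can I have some coffee?": "No, I'd rather have some ribs.",
    "How do I get a refund?": "Use the chatbot to submit a refund request.",
}
questions = list(faq)

# Build the BM25 index over the tokenized questions.
bm25 = BM25Okapi([tokenize(q) for q in questions])

# Score every stored question against the user query.
scores = bm25.get_scores(tokenize("Coffee, please!"))
best_question = questions[scores.argmax()]
print(faq[best_question])  # -> "No, I'd rather have some ribs."
```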

This algorithm worked just fine until the deep learning era arrived. One of its most substantial problems is that neither tokenization nor inverted indexing takes the semantics of the sentences into account. For instance, in the example above, a user could instead say, "Can I have a cup of cappuccino?" With tokenization and inverted indexing, there's a very good chance the system won't recognize "coffee" and "a cup of cappuccino" as the same thing and will thus fail to understand the sentence. AI engineers had to build a lot of workarounds for these kinds of issues.

But things got much better with deep learning. With pre-trained models like BERT and pipelines like Towhee, we can easily encode whole sentences as vectors, store them in a vector database such as Milvus, and simply calculate vector distances to figure out how semantically similar two sentences are.
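
As a sketch of the semantic approach (assuming the sentence-transformers package and its all-MiniLM-L6-v2 model; a real deployment would store the embeddings in a vector database like Milvus rather than compare them in memory):

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

faq_questions = ["Can I have some coffee?", "How do I get a refund?"]
faq_vectors = model.encode(faq_questions, convert_to_tensor=True)

# A paraphrase that keyword matching would likely miss:
query = "Can I have a cup of cappuccino?"
query_vector = model.encode(query, convert_to_tensor=True)

# Cosine similarity between the query and every stored question.
scores = util.cos_sim(query_vector, faq_vectors)[0]
print(faq_questions[int(scores.argmax())])  # -> "Can I have some coffee?"
```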

The algorithm behind conversational user interfaces.

AI-powered Call Quality Control

Call centers are indispensable for many large companies that care about customer experience, and assessment is necessary to spot issues and improve call quality. The problem is that the call centers of large multinational companies receive a tremendous number of inbound calls per day, so it is impractical to listen to each of the millions of calls and evaluate them. Most of the time, when you hear "in order to improve our service, this call may be recorded" on the other end of the phone, it doesn't necessarily mean your call will be checked for quality of service. In fact, even in big organizations, only 2%-3% of calls are replayed and checked manually by quality control staff.

A call center. Image source: Pexels by Tima Miroshnichenko.

This is where NLP can help. An AI-powered call quality control engine built on NLP can automatically spot issues in calls and handle massive volumes of them in a relatively short period of time. The engine helps detect whether the call operator uses the proper opening and closing sentences and avoids banned slang and taboo words during the call. This can easily raise the check rate from 2%-3% to 100%, with even less manpower and lower costs.

With a typical AI-powered call quality control service, users first upload the call recordings to the service. Automatic speech recognition (ASR) is then used to transcribe the audio files into text. All the text is subsequently vectorized using deep learning models and stored in a vector database. The service compares the similarity between these text vectors and vectors generated from a given set of criteria, such as taboo-word vectors and vectors of desired opening and closing sentences. With efficient vector similarity search, handling huge volumes of call recordings becomes much more accurate and less time-consuming.
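
Here is a minimal sketch of the comparison step (again with sentence-transformers; the phrases, model choice, and the 0.6 threshold are illustrative assumptions, not any vendor's actual settings), where the ASR stage is represented by an already-transcribed utterance:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Criteria vectors: phrases the operator must say, and phrases to avoid.
required_openings = ["Thank you for calling, how may I help you?"]
taboo_phrases = ["that's not my problem", "stop wasting my time"]
criteria = model.encode(required_openings + taboo_phrases, convert_to_tensor=True)

# One transcribed utterance coming out of the ASR step.
utterance = "Thanks for calling, what can I do for you today?"
vec = model.encode(utterance, convert_to_tensor=True)

sims = util.cos_sim(vec, criteria)[0]
if float(sims[0]) > 0.6:          # close to the required opening
    print("proper opening detected")
if float(sims[1:].max()) > 0.6:   # close to any taboo phrase
    print("taboo language flagged")
```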

Intelligent outbound calls

Believe it or not, some of the phone calls you receive are not from humans; chances are a robot is talking on the other end. To reduce operating costs, some companies leverage AI phone calls for marketing purposes and much more. Google launched Google Duplex back in 2018, a system that can conduct human-computer conversations and accomplish real-world tasks over the phone. The mechanism behind AI phone calls is much the same as the one behind chatbots.

A user asks the Google Assistant for an appointment, which the Assistant then schedules by having Duplex call the business. Image source: Google AI blog.

In other cases, you might have also heard something like this on the phone:

"Thank you for calling. To set up a new account, press 1. To modify your password to an existing account, press 2. To speak to our customer service agent, press 0."

or in recent years, something like (with a strong robot accent):

“Please tell me what I can help you with. For example, You can ask me ‘check the balance of my account’.”

This is known as interactive voice response (IVR): an automated phone system that interacts with callers and acts on their answers and choices. Callers are usually offered a menu of options, and their choice determines what the system does next. If a request is too complex, the system can route the caller to a human agent. This can greatly reduce labor costs and save time for companies.

Intents are usually very helpful when dealing with calls like these. An intent is a group of sentences or dialogues representing a certain user intention. For example, "weather forecast" can be an intent, and it can be triggered by many different sentences. See the picture of a Google Dialogflow example below, and the toy matching sketch after it. Intents can be organized together to accomplish complicated interactive human-computer conversations, like booking a restaurant or ordering a flight ticket.

Google Dialogflow.
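
A toy illustration of the idea in plain Python (not Dialogflow's actual API; the intents and example utterances are invented for the sketch): each intent holds a few example sentences, and the incoming utterance is matched by word overlap.

```python
import re

# Hypothetical intents, each with a few example trigger sentences.
INTENTS = {
    "weather_forecast": ["what's the weather like", "weather forecast for tomorrow"],
    "book_restaurant": ["book a table for two", "reserve a restaurant for tonight"],
}

def tokens(text):
    return set(re.findall(r"\w+", text.lower()))

def match_intent(utterance):
    """Return the intent whose examples share the most words with the utterance."""
    words = tokens(utterance)
    overlap = lambda name: max(len(words & tokens(ex)) for ex in INTENTS[name])
    best = max(INTENTS, key=overlap)
    return best if overlap(best) > 0 else None

print(match_intent("Can you give me the weather forecast?"))  # -> weather_forecast
```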

AI-powered call operators

By adopting NLP, companies can take call operation services to the next level. Conventionally, call operators need to search a hundred-page professional manual to deal with each customer call, solving every problem case by case. This process is extremely time-consuming and, much of the time, fails to give callers a satisfying solution. With an AI-powered call center, however, dealing with customer calls can be both pleasant and efficient.

AI-aided call operators with greater efficiency. Image source: Pexels by MART PRODUCTION.

When a customer dials in, the system immediately looks up the customer and their order information in the database, giving the call operator a general picture of the case: the customer's age, marital status, past purchases, and so on. During the conversation, the whole call is recorded, with a live chat log shown on the screen (thanks to live automatic speech recognition). Moreover, when a customer asks a hard question or starts complaining, the machine catches it automatically, looks into the AI database, and suggests the best way to respond. With a decent deep learning model, the service could answer customers' questions correctly more than 99% of the time and handle complaints with well-chosen words.

Knowledge graph

A knowledge graph is an information graph consisting of nodes, edges, and labels. A node (or vertex) usually represents an entity: a person, a place, an item, or an event. Edges are the lines connecting the nodes, and labels signify the connection or relationship between a pair of nodes. A typical knowledge graph example is shown below:

A sample knowledge graph. Source: A guide to Knowledge Graphs.

The raw data for constructing a knowledge graph may come from various sources: unstructured documents, semi-structured data, and structured knowledge. Various algorithms must be applied to these data to extract entities (nodes) and the relationships between entities (edges), among them entity recognition, relation extraction, label mining, and entity linking. To build a knowledge graph from documents, for instance, we first use deep learning pipelines to generate embeddings and store them in a vector database.
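
As a small, self-contained sketch (using the networkx package; the entities and relations are invented for illustration), a knowledge graph is simply labeled nodes joined by labeled edges:

```python
# pip install networkx
import networkx as nx

kg = nx.DiGraph()

# Nodes are entities; the edge label carries the relationship.
kg.add_edge("Leonardo da Vinci", "Mona Lisa", label="painted")
kg.add_edge("Mona Lisa", "Louvre", label="displayed_in")
kg.add_edge("Louvre", "Paris", label="located_in")

# Answer a simple question: where is the Mona Lisa displayed?
for _, place, data in kg.out_edges("Mona Lisa", data=True):
    if data["label"] == "displayed_in":
        print(place)  # -> Louvre
```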

Once the knowledge graph is constructed, you can see it as the underlying pillar for many more specific applications, such as smart search engines, question-answering systems, recommender systems, advertising, and more.

Endnote

This article introduced the top five real-world NLP applications. Leveraging NLP in your business can greatly reduce operational costs and improve user experience. Of course, apart from the five applications introduced here, NLP can facilitate many more business scenarios, including social media analytics, translation, sentiment analysis, meeting summarization, and more.

There are also a bunch of NLP+, or more generally, AI+ concepts that have grown more and more popular in the past few years. For example, with AI + RPA (robotic process automation), you can easily build smart pipelines that complete workflows automatically, such as an expense reimbursement workflow where you just upload your receipt and AI + RPA does all the rest for you. There's also AI + OCR, where you just take a picture of, say, a contract, and AI tells you whether there's a mistake in it, for instance a company telephone number that doesn't match the number shown in a Google search.

Source


Researchers Find Way to Run Malware on iPhone Even When It’s OFF

A first-of-its-kind security analysis of the iOS Find My function has demonstrated a novel attack surface that makes it possible to tamper with the firmware and load malware onto a Bluetooth chip that executes while an iPhone is "off."

The mechanism takes advantage of the fact that wireless chips for Bluetooth, near-field communication (NFC), and ultra-wideband (UWB) continue to operate after iOS shuts down and the device enters a "power reserve" Low Power Mode (LPM).

While this is done to enable features like Find My and facilitate Express Card transactions, all three wireless chips have direct access to the secure element, academics from the Secure Mobile Networking Lab (SEEMOO) at the Technical University of Darmstadt said in a paper.

"The Bluetooth and UWB chips are hardwired to the Secure Element (SE) in the NFC chip, storing secrets that should be available in LPM," the researchers said.

"Since LPM support is implemented in hardware, it cannot be removed by changing software components. As a result, on modern iPhones, wireless chips can no longer be trusted to be turned off after shutdown. This poses a new threat model."

The findings are set to be presented at the ACM Conference on Security and Privacy in Wireless and Mobile Networks (WiSec 2022) this week.

The LPM features, newly introduced last year with iOS 15, make it possible to track lost devices using the Find My network. Current devices with ultra-wideband support include the iPhone 11, iPhone 12, and iPhone 13.

A message displayed when turning off iPhones reads thus: "iPhone remains findable after power off. Find My helps you locate this iPhone when it is lost or stolen, even when it is in power reserve mode or when powered off."


Calling the current LPM implementation "opaque," the researchers not only observed occasional failures when initializing Find My advertisements during power off, effectively contradicting the aforementioned message; they also found that the Bluetooth firmware is neither signed nor encrypted.

By taking advantage of this loophole, an adversary with privileged access can create malware that's capable of being executed on an iPhone Bluetooth chip even when it's powered off.

However, for such a firmware compromise to happen, the attacker must be able to communicate with the firmware via the operating system, modify the firmware image, or gain code execution on an LPM-enabled chip over the air by exploiting flaws such as BrakTooth.

Put differently, the idea is to alter the LPM application thread to embed malware, such as code that alerts the malicious actor to a victim's Find My Bluetooth broadcasts, enabling the threat actor to keep remote tabs on the target.

"Instead of changing existing functionality, they could also add completely new features," SEEMOO researchers pointed out, adding they responsibly disclosed all the issues to Apple, but that the tech giant "had no feedback."

With LPM-related features taking a stealthier approach to carrying out their intended use cases, SEEMOO called on Apple to include a hardware-based switch to disconnect the battery, so as to alleviate any surveillance concerns that could arise from firmware-level attacks.

"Since LPM support is based on the iPhone's hardware, it cannot be removed with system updates," the researchers said. "Thus, it has a long-lasting effect on the overall iOS security model."

"Design of LPM features seems to be mostly driven by functionality, without considering threats outside of the intended applications. Find My after power off turns shutdown iPhones into tracking devices by design, and the implementation within the Bluetooth firmware is not secured against manipulation."

Source


What is differential privacy in machine learning (preview)?

How differential privacy works

Differential privacy is a set of systems and practices that help keep the data of individuals safe and private. In machine learning solutions, differential privacy may be required for regulatory compliance.

Differential privacy machine learning process.

In traditional scenarios, raw data is stored in files and databases. When users analyze data, they typically work with the raw data. This is a concern because it might infringe on an individual's privacy. Differential privacy tries to deal with this problem by adding "noise," or randomness, to the data so that users can't identify any individual data points. At the least, such a system provides plausible deniability. Thus, the privacy of individuals is preserved with limited impact on the accuracy of the data.

In differentially private systems, data is shared through requests called queries. When a user submits a query for data, operations known as privacy mechanisms add noise to the requested data and return an approximation instead of the raw data. This privacy-preserving result appears in a report. Reports consist of two parts: the actual data computed and a description of how the data was created.
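
A minimal sketch of such a privacy mechanism (the classic Laplace mechanism, with numpy; the dataset and epsilon value are toy assumptions): a counting query changes by at most 1 when one person's record is added or removed, so adding Laplace(1/ε) noise to the true count makes the released answer differentially private.

```python
import numpy as np

rng = np.random.default_rng()

def private_count(data, predicate, epsilon):
    """Release a differentially private count via the Laplace mechanism.

    A count query has sensitivity 1, so noise drawn from
    Laplace(scale = 1 / epsilon) suffices.
    """
    true_count = sum(1 for row in data if predicate(row))
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

ages = [34, 45, 29, 61, 52, 38, 47]            # toy raw data
print(private_count(ages, lambda a: a > 40, epsilon=0.5))
```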

Differential privacy metrics

Differential privacy tries to protect against the possibility that a user can produce an indefinite number of reports to eventually reveal sensitive data. A value known as epsilon measures how noisy, or private, a report is. Epsilon has an inverse relationship to noise or privacy: the lower the epsilon, the noisier (and more private) the data is.

Epsilon values are non-negative. Values below 1 provide full plausible deniability. Anything above 1 comes with a higher risk of exposing the actual data. As you implement machine learning solutions with differential privacy, you want to use data with epsilon values between 0 and 1.

Another value directly correlated with epsilon is delta. Delta is a measure of the probability that a report isn't fully private. The higher the delta, the higher the epsilon. Because these values are correlated, epsilon is used more often.

Limit queries with a privacy budget

To ensure privacy in systems where multiple queries are allowed, differential privacy defines a rate limit. This limit is known as a privacy budget. Privacy budgets prevent data from being reconstructed through multiple queries. A privacy budget is allocated an epsilon amount, typically between 1 and 3, to limit the risk of reidentification. As reports are generated, the budget tracks the epsilon value of each individual report as well as the aggregate for all reports. After a privacy budget is spent or depleted, users can no longer access the data.
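
A sketch of the bookkeeping (plain Python; the simple additive composition rule and the budget of 3.0 are illustrative assumptions): each query spends part of the epsilon budget, and queries are refused once it is depleted.

```python
class PrivacyBudget:
    """Track cumulative epsilon spent across queries (additive composition)."""

    def __init__(self, total_epsilon=3.0):
        self.total = total_epsilon
        self.spent = 0.0

    def charge(self, epsilon):
        # Refuse the query outright once the budget would be exceeded.
        if self.spent + epsilon > self.total:
            raise RuntimeError("Privacy budget depleted; query refused.")
        self.spent += epsilon

budget = PrivacyBudget(total_epsilon=3.0)
for _ in range(6):
    budget.charge(0.5)   # six queries at epsilon = 0.5 exactly exhaust the budget
# budget.charge(0.5)     # a seventh query would raise RuntimeError
```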

Reliability of data

Although the preservation of privacy should be the goal, there is a tradeoff with the usability and reliability of the data. In data analytics, accuracy can be thought of as a measure of the uncertainty introduced by sampling errors; this uncertainty tends to fall within certain bounds. Accuracy from a differential privacy perspective instead measures the reliability of the data, which is affected by the uncertainty introduced by the privacy mechanisms. In short, a higher level of noise or privacy translates to data with a lower epsilon, lower accuracy, and lower reliability.
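
To see the tradeoff numerically, here is a toy numpy experiment reusing the Laplace mechanism from above (the query value and epsilon choices are arbitrary): lower epsilon means larger noise and therefore less reliable answers.

```python
import numpy as np

rng = np.random.default_rng(0)

for epsilon in (0.1, 0.5, 1.0):
    # Average absolute error over many noisy releases at this epsilon.
    noise = rng.laplace(0.0, 1.0 / epsilon, size=10_000)
    print(f"epsilon={epsilon}: mean abs error ~ {np.abs(noise).mean():.1f}")
# epsilon=0.1 gives ~10x the error of epsilon=1.0: more private, less reliable.
```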

Open-source differential privacy libraries

SmartNoise is an open-source project that contains components for building machine learning solutions with differential privacy. SmartNoise is made up of the following top-level components:

  • SmartNoise Core library
  • SmartNoise SDK library

SmartNoise Core

The core library includes the following privacy mechanisms for implementing a differentially private system:

  • Analysis: A graph description of arbitrary computations.
  • Validator: A Rust library that contains a set of tools for checking and deriving the necessary conditions for an analysis to be differentially private.
  • Runtime: The medium to execute the analysis. The reference runtime is written in Rust, but runtimes can be written using any computation framework, such as SQL or Spark, depending on your data needs.
  • Bindings: Language bindings and helper libraries to build analyses. Currently, SmartNoise provides Python bindings.

SmartNoise SDK

The system library provides the following tools and services for working with tabular and relational data:

  • Data Access: A library that intercepts and processes SQL queries and produces reports. It is implemented in Python and supports the following ODBC and DBAPI data sources: PostgreSQL, SQL Server, Spark, Presto, and Pandas.
  • Service: An execution service that provides a REST endpoint to serve requests or queries against shared data sources. The service is designed to allow composition of differential privacy modules that operate on requests containing different delta and epsilon values, also known as heterogeneous requests. This reference implementation accounts for additional impact from queries on correlated data.
  • Evaluator: A stochastic evaluator that checks for privacy violations, accuracy, and bias. It supports the following tests:
      • Privacy Test: Determines whether a report adheres to the conditions of differential privacy.
      • Accuracy Test: Measures whether the reliability of reports falls within the upper and lower bounds given a 95% confidence level.
      • Utility Test: Determines whether the confidence bounds of a report are close enough to the data while still maximizing privacy.
      • Bias Test: Measures the distribution of reports for repeated queries to ensure they aren't unbalanced.



Responsible AI – Privacy and Security Requirements

Training data and prediction requests can both contain sensitive information about people or businesses, which has to be protected. How do you safeguard the privacy of individuals? What steps are taken to ensure that individuals have control over their data? Many countries have regulations to ensure privacy and security.

In Europe there is the GDPR (General Data Protection Regulation) and in California the CCPA (California Consumer Privacy Act). Fundamentally, both give individuals control over their data and require companies to protect the data being used in a model. When data processing is based on consent, an individual has the right to revoke that consent at any time.

 Defending ML Models against attacks – Ensuring privacy of consumer data:

I briefly discussed the tools for adversarial training, the CleverHans and Foolbox Python libraries, here: Model Debugging: Sensitivity Analysis, Adversarial Training, Residual Analysis. Let us now look at more stringent means of protecting an ML model against attacks; this matters because it ensures the privacy and security of the data. An ML model may be attacked in different ways; some literature classifies the attacks into "Information Harms" and "Behavioural Harms". Information Harm occurs when information is allowed to leak from the model. There are different forms of Information Harm: Membership Inference, Model Inversion, and Model Extraction. In Membership Inference, the attacker can determine whether some information was part of the training data. In Model Inversion, the attacker can extract training data from the model, and in Model Extraction, the attacker is able to extract the entire model!

Behavioural Harm occurs when the attacker can change the behaviour of the ML model itself, for example by inserting malicious data. I have given an example involving an autonomous vehicle in this article: Model Debugging: Sensitivity Analysis, Adversarial Training, Residual Analysis.

Cryptography | Differential privacy to protect data

You should consider privacy-enhancing technologies like Secure Multi-Party Computation (SMPC) and Fully Homomorphic Encryption (FHE). SMPC involves multiple systems training or serving the model while the actual data is kept secure.

In FHE the data is encrypted: prediction requests involve encrypted data, and training of the model is also carried out on encrypted data. This comes with a heavy computational cost because the data is never decrypted except by the user. Users send encrypted prediction requests and receive back an encrypted result. The goal is that, using cryptography, you can protect the consumer's data.
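
A minimal sketch with the open-source TenSEAL library (CKKS scheme; the toy linear model and encryption parameters are illustrative assumptions, not a production configuration): the server computes a dot product on encrypted features without ever seeing them, and only the key holder can decrypt the result.

```python
# pip install tenseal
import tenseal as ts

# Client side: create an encryption context and encrypt the features.
context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2**40
context.generate_galois_keys()

features = [1.0, 2.0, 3.0]
enc_features = ts.ckks_vector(context, features)

# Server side: evaluate a toy linear model on the ciphertext.
weights = [0.5, -1.0, 2.0]
enc_score = enc_features.dot(weights)   # computed without decrypting

# Back on the client: only the secret-key holder can read the answer.
print(enc_score.decrypt())              # ~ [4.5]
```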

Differential Privacy in Machine Learning

Differential privacy involves protecting the data by adding noise so that attackers cannot identify the real content. SmartNoise is an open-source project that contains components for building machine learning solutions with differential privacy. SmartNoise is made up of the following top-level components:

✔️SmartNoise Core Library

✔️SmartNoise SDK Library

This is a good read to understand about Differential Privacy: https://docs.microsoft.com/en-us/azure/machine-learning/concept-differential-privacy

 Private Aggregation of Teacher Ensembles (PATE)

This follows the knowledge distillation concept that I discussed here: Knowledge Distillation (Post 1, Post 2). PATE begins by dividing the data into "k" partitions with no overlaps. It then trains k teacher models on those partitions and aggregates the results into an aggregate teacher model. During the aggregation for the aggregate teacher, noise is added to the output.

For deployment, you use only the student model. To train the student model, you take unlabelled public data and feed it to the aggregate teacher model; the result is labelled data, with which the student model is trained. A numeric sketch of the aggregation step follows the figure below.

The process is illustrated in the figure below:


PATE (Private Aggregation of Teacher Ensembles)
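
As a minimal sketch of the noisy aggregation step (numpy; the votes, class count, and noise scale are toy assumptions): each teacher votes for a label, Laplace noise is added to the vote histogram, and the noisy argmax becomes the label handed to the student.

```python
import numpy as np

rng = np.random.default_rng()

def noisy_aggregate(teacher_votes, num_classes, noise_scale=1.0):
    """PATE-style aggregation: noisy argmax over teacher label votes."""
    counts = np.bincount(teacher_votes, minlength=num_classes)
    noisy_counts = counts + rng.laplace(0.0, noise_scale, size=num_classes)
    return int(noisy_counts.argmax())

# 9 teachers vote on one unlabelled public example (3 possible classes).
votes = np.array([2, 2, 1, 2, 0, 2, 2, 1, 2])
student_label = noisy_aggregate(votes, num_classes=3)
print(student_label)   # usually 2; noise occasionally flips low-margin votes
```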

Source



Employee monitoring software became the new normal during COVID-19. It seems workers are stuck with it

Many employers say they'll keep the surveillance software switched on — even for office workers.


In early 2020, as offices emptied and employees set up laptops on kitchen tables to work from home, the way managers kept tabs on white-collar workers underwent an abrupt change as well.

Bosses used to counting the number of empty desks, or gauging the volume of keyboard clatter, now had to rely on video calls and tiny green "active" icons in workplace chat programs.

In response, many employers splashed out on sophisticated kinds of spyware to claw back some oversight.

"Employee monitoring software" became the new normal, logging keystrokes and mouse movement, capturing screenshots, tracking location, and even activating webcams and microphones.

At the same time, workers were dreaming up creative new ways to evade the software's all-seeing eye.

Now, as workers return to the office, demand for employee tracking "bossware" remains high, its makers say.

Surveys of employers in white-collar industries show that even returned office workers will be subject to these new tools.

What was introduced in the crisis of the pandemic, as a short-term remedy for lockdowns and working from home (WFH), has quietly become the "new normal" for many Australian workplaces.

A game of cat-and-mouse jiggler

For many workers, the surveillance software came out of nowhere.

The abrupt appearance of spyware in many workplaces can be seen in the sudden popularity of covert devices designed to evade this surveillance.

Before the pandemic, "mouse jigglers" were niche gadgets used by police and security agencies to keep seized computers from logging out and requiring a password to access.

An array of mouse jigglers for sale on eBay. (Supplied: eBay)

Plugged into a laptop's USB port, the jiggler randomly moves the mouse cursor, faking activity when there's no-one there.

When the pandemic hit, sales boomed among WFH employees.

In the last two years, James Franklin, a young Melbourne software engineer, has mailed 5,000 jigglers to customers all over the country — mostly to employees of "large enterprises", he says.

Often, he's had to upgrade the devices to evade an employers' latest methods of detecting and blocking them.

It's been a game of cat-and-mouse jiggler.

"Unbelievable demand is the best way to describe it," he said.

And mouse jigglers aren't the only trick for evading the software.

In July last year, a Californian mum's video about a WFH hack went viral on TikTok.

Leah told how her computer set her status to "away" whenever she stopped moving her cursor for more than a few seconds, so she had placed a small vibrating device under the mouse.

"It's called a mouse mover … so you can go to the bathroom, free from paranoia."

Others picked up the story and shared their tips, from free downloads of mouse-mimicking software to YouTube videos that are intended to play on a phone screen, with an optical mouse resting on top. The movement of the lines in the video makes the cursor move.

"A lot of people have reached out on TikTok," Leah told the ABC.

"There were a lot of people going, 'Oh, my gosh, I can't believe I haven't heard of this before, send me the link.'"

Tracking software sales are up — and staying up

On the other side of the world, in New York, EfficientLab makes and sells employee surveillance software called Controlio that's widely used in Australia.

It has "hundreds" of Australian clients, said sales manager Moath Galeb.

"At the beginning of the pandemic, there was already a lot of companies looking into monitoring software, but it wasn't such an important feature," he said.

"But the pandemic forced many people to work remotely and the companies started to look into employee monitoring software more seriously."

Managers can track employees' productivity scores on a realtime dashboard. (Supplied: Controlio)

In Australia, as in other countries, the number of Controlio clients has increased "two or three times" with the pandemic.

This increase was to be expected — but what surprised even Mr Galeb was that demand has remained strong in recent months.

"They're getting these insights into how people get their work done," he said.

The most popular features for employers, he said, track employee "active time" to generate a "productivity score".

Managers view these statistics through an online dashboard.

Advocates say this is a way of looking after employees, rather than spying on them.

Bosses can see who is "working too many hours", Mr Galeb said.

"Depending on the data, or the insights that you receive, you get to build this picture of who is doing more and doing less."

Nothing new for blue-collar workers

But those being monitored are likely to see things a little differently. 

Ultimately, how the software is used depends on what power bosses have over their workers.

For the increasing number of people in insecure, casualised work, these tools appear less than benign.

In an August 2020 submission to a NSW parliamentary committee investigating the impact of technological change on the future of work, the United Workers Union featured the story of a call centre worker who had been working remotely during the pandemic.

One day, the employer informed the man that monitoring software had detected his apparent absence for a 45-minute period two weeks earlier.

The submission reads:

Unable to remember exactly what he was doing that particular day, the matter was escalated to senior management who demanded to know exactly where he physically was during this time. This 45-minute break in surveillance caused considerable grief and anxiety for the company. A perceived productivity loss of $27 (the worker's hourly rate) resulted in several meetings involving members of upper management, formal letters of correspondence, and a written warning delivered to the worker.

There were many stories like this one, said Lauren Kelly, who wrote the submission.

"The software is sold as a tool of productivity and efficiency, but really it's about surveillance and control," she said.

"I find it very unlikely it would result in management asking somebody to slow down and do less work."

Ms Kelly, who is now a PhD candidate at RMIT with a focus on workplace technologies including surveillance, says tools for tracking an employee's location and activity are nothing new — what has changed in the past two years is the types of workplaces where they are used.

Before the pandemic, it was more for blue-collar workers. Now, it's for white-collar workers too.

"Once it's in, it's in. It doesn't often get uninstalled," she said.

"The tracking software becomes a ubiquitous part of the infrastructure of management."

The 'quid pro quo' of WFH?

More than half of Australian small-to-medium-sized businesses used software to monitor the activity and productivity of employees working remotely, according to a Capterra survey in November 2020.

That's about on par with the United States.

"There's a tendency in Australia to view these workplace trends as really bad in other places like the United States and China," Ms Kelly said.

"But actually, those trends are already here."

The latest software claims to monitor employee emotions like happiness and sadness. (Supplied: StaffCircle)

In fact, a 2021 survey suggested Australian employers had embraced location-tracking software more warmly than those of any other country.

Every two years, the international law firm Herbert Smith Freehills surveys thousands of its large corporate clients around the world for an ongoing series of reports on the future of work.

In 2021, it found 90 per cent of employers in Australia monitor the location of employees when they work remotely, significantly more than the global average of less than 80 per cent.

Many introduced these tools having found that during lockdown, some employees had relocated interstate or even overseas without asking permission or informing their manager, said Natalie Gaspar, an employment lawyer and partner at Herbert Smith Freehills.

"I had clients of mine saying that they didn't realise that their employees were working in India or Pakistan," she said.

"And that's relevant because there [are] different laws that apply in those different jurisdictions about workers compensation laws, safety laws, all those sorts of things."

She said that, anecdotally, many of her "large corporate" clients planned to keep the employee monitoring software tools — even for office workers.

"I think that's here to stay in large parts."

And she said employees, in general, accepted this elevated level of surveillance as "the cost of flexibility".

"It's the quid pro quo for working from home," she said.

Is it legal?

The short answer is yes, but there are complications.

There's no consistent set of laws operating across jurisdictions in Australia that regulate surveillance of the workplace.

In New South Wales and the ACT, an employer can only install monitoring software on a computer they supply for the purposes of work.

With some exceptions, they must also advise employees they're installing the software and explain what is being monitored 14 days prior to the software being installed or activated.

In NSW, the ACT and Victoria, it's an offence to install an optical or listening device in workplace toilets, bathroom or change rooms.

South Australia, Tasmania, Western Australia, the Northern Territory and Queensland do not currently have specific workplace surveillance laws in place.

Smile, you're at your laptop

Location tracking software may be the cost of WFH, but what about tools that check whether you're smiling into the phone, or monitor the pace and tone of your voice for depression and fatigue?

These are some of the features being rolled out in the latest generation of monitoring software.

Zoom, for instance, recently introduced a tool that provides sales meeting hosts with a post-meeting transcription and "sentiment analysis".

Zoom IQ for Sales offers a breakdown of how the meeting went. (Supplied: Zoom)

Software already on the market trawls email and Slack messages to detect levels of emotion like happiness, anger, disgust, fear or sadness.

The Herbert Smith Freehills 2021 survey found 82 per cent of respondents planned to introduce digital tools to measure employee wellbeing.

A bit under half said they already had processes in place to detect and address wellbeing issues, and these were assisted by technology such as sentiment analysis software.

Often, these technologies are tested in call centres before they're rolled out to other industries, Ms Kelly said.

"Affect monitoring is very controversial and the technology is flawed.

"Some researchers would argue it's simply not possible for AI or any software to truly 'know' what a person is feeling.

"Regardless, there's a market for it and some employers are buying into it."

The movement of the second hand of an analogue wristwatch moves an optical mouse cursor a tiny amount. (Supplied: Reddit)

Back in Melbourne, Mr Franklin remains hopeful that plucky inventors can thwart the spread of bossware.

When companies switched to logging keyboard inputs, someone invented a random keyboard input device.

When managers went a step further and monitored what was happening on employees' screens, a tool appeared that cycled through a prepared list of webpages at regular intervals.

"The sky's the limit when it comes to defeating these systems," he said.

And sometimes the best solutions are low tech.

Recently, an employer found a way to block a worker's mouse jiggler, so he simply taped his mouse to the office fan.

"And it dragged the mouse back and forth.

"Then he went out to lunch."