AI-assisted code can be inherently insecure, study finds
Programmers must be educated about strong coding practices
By Alfonso Maruccia
Forward-looking: Machine learning algorithms are all the rage now, as they are used to generate any kind of "original" content after being trained on enormous pre-existing datasets. Code-generating AIs, however, could pose a real issue for software security in the future.
AI systems like GitHub Copilot promise to make programmers' lives easier by creating entire chunks of "new" code based on natural-language textual inputs and pre-existing context. But code-generating algorithms can also introduce security risks, as a recent study involving several developers has found.
The researchers said that when the programmers had access to OpenAI's Codex AI, the resulting code was more likely to be incorrect or insecure than the "hand-made" solutions conceived by the control group. Furthermore, the programmers with AI-assisted solutions were more likely than those in the control group to say that their insecure code was secure.
Neil Perry, a PhD candidate at Stanford and the study's lead co-author, said that "code-generating systems are currently not a replacement for human developers." Developers could be using AI-assisted tools to complete tasks outside their own areas of expertise, or to speed up a programming task they are already skilled in. Both groups should be concerned, the study author said, and should always double-check the generated code.
According to Megha Srivastava, a postgraduate Stanford student and the second co-author of the study, Codex is anything but useless: despite the shortcomings of the "stupid" AI, code-generating systems can be useful when employed for low-risk tasks. Furthermore, the programmers involved in the study had no particular expertise in security, which could have helped them spot vulnerable or insecure code, Srivastava said.
AI algorithms could also be fine-tuned to improve their coding suggestions, and companies that develop their own systems can get better results from a model that generates code more in line with their own security practices. Code-generating technology is an "exciting" development with many people eager to use it, the study authors said. There is still a lot of work to be done, however, on finding proper solutions to AI shortcomings.