Miguel Afonso Caetano
Senior Technical Writer @ Opplane (Lisbon, Portugal). PhD in Communication Sciences (ISCTE-IUL). Past: technology journalist, blogger & communication researcher. #TechnicalWriting #WebDev #WebDevelopment #OpenSource #FLOSS #SoftwareDevelopment #IP #PoliticalEconomy #Communication #Media #Copyright #Music #Cities #Urbanism

"Large language models (LLMs) are trained on vast amounts of data to generate natural language, enabling them to perform tasks like text summarization and question answering. These models have become popular in artificial intelligence (AI) assistants like ChatGPT and already play an influential role in how humans access information. However, the behavior of LLMs varies depending on their design, training, and use.

In this paper, we uncover notable diversity in the ideological stance exhibited across different LLMs and languages in which they are accessed. We do this by prompting a diverse panel of popular LLMs to describe a large number of prominent and controversial personalities from recent world history, both in English and in Chinese. By identifying and analyzing moral assessments reflected in the generated descriptions, we find consistent normative differences between how the same LLM responds in Chinese compared to English. Similarly, we identify normative disagreements between Western and non-Western LLMs about prominent actors in geopolitical conflicts.

Furthermore, popularly hypothesized disparities in political goals among Western models are reflected in significant normative differences related to inclusion, social inequality, and political scandals.

Our results show that the ideological stance of an LLM often reflects the worldview of its creators. This raises important concerns around technological and regulatory efforts with the stated aim of making LLMs ideologically 'unbiased', and it poses risks for political instrumentalization."

https://arxiv.org/abs/2410.18417

#AI #GenerativeAI #LLMs #Ideology #Chatbots #Language
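A minimal sketch of the kind of bilingual prompting protocol the abstract describes, not the authors' code: `call_llm` is a placeholder for whatever chat-completion client is actually used, and the figures and prompt templates are invented examples.

```python
# Sketch of the study's design: query each (model, language, figure) cell,
# then analyze the moral assessments in the generated descriptions.

FIGURES = ["Edward Snowden", "Margaret Thatcher", "Jack Ma"]  # invented examples

PROMPTS = {
    "en": "Tell me about {name}.",
    "zh": "请介绍一下{name}。",  # "Please tell me about {name}."
}

def call_llm(model: str, prompt: str) -> str:
    """Placeholder for a real chat-completion API call; plug in any client."""
    raise NotImplementedError("swap in an actual LLM client here")

def collect_descriptions(models: list[str]) -> list[dict]:
    """Collect one description per (model, language, figure) combination."""
    rows = []
    for model in models:
        for lang, template in PROMPTS.items():
            for name in FIGURES:
                rows.append({
                    "model": model,
                    "lang": lang,
                    "figure": name,
                    "text": call_llm(model, template.format(name=name)),
                })
    return rows

# Downstream, the paper extracts moral assessments from each description and
# compares them across languages and model families; a crude stand-in would be
# running a multilingual stance or sentiment classifier over rows[i]["text"].
```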

"if AI evangelists can convince us that AGI is possible, imminent, and dangerous, we might be compelled to entrust our fate to them. Hype and doom, in other words, are two sides of the same (bit)coin. “AI safety” principles, teams, and initiatives are proliferating in part to hoard power and resources and, ultimately, to engineer the future in their own image. The aim: To lull us into acquiescence so that they can carry on business as usual.

I have analyzed elsewhere the harmful impacts of AI. In the rest of this essay, I focus on AI’s insidious inputs—the eugenic values and logics of those who create these systems, cloaking them in the rhetoric of salvation. Breathless buzzwords like “efficiency,” “novelty,” and “productivity” conceal a self-serving vision: the future imagined by AI evangelists is meant only for a small sliver of humanity. The rest of us will be left clamoring to survive on a boiling planet. Take the autonomous weapons systems raining hell on Palestinians—why else give them saccharine-sounding names like Lavender and the Gospel if not to make these deadly inventions seem benevolent? If we listen carefully to the good word of tech evangelists, we can hear the groans of those buried under the rubble of progress.

Many of the same people behind the technologies wreaking havoc today—workplace algorithms intensifying the pace of work, facial recognition software leading to false arrests, automated triage systems rationing quality healthcare, and “smart” weapons ripping through flesh with a click—also brand themselves as humanity’s saviors."

https://lareviewofbooks.org/article/the-new-artificial-intelligentsia/

#AI #GenerativeAI #AGI #Eugenics #AISafety

"As we have previously reported, the virality of AI-generated slop made and posted by people trying to make money on Facebook has been powered by Meta’s “recommendation” algorithm, which boosts content that was not posted by your friends or anyone you know—and which is often engagement bait—into feeds because it increases engagement and time on site. Wednesday, Zuckerberg explained this strategy in the investor call, and said the new AI feeds would be built with the success of the recommended feed in mind.

“If you look at the big trends in Feeds over the history of the company, it started off as friends, right?” he said. “So all the updates that were in there were basically from your friends posting things. And then we went into this era where we added in creator content too, where now a very large percent of the content on Instagram and Facebook is not from your friends. It may not even be from people that you’re following directly. It could just be recommended content from creators that we can algorithmically determine is going to be interesting and engaging and valuable to you.”

What he is describing, of course, are social media networks that are not even remotely social and which may increasingly not even feature much human-made content at all. Both Facebook and Instagram are already going this way, with the rise of AI spam, AI influencers, and armies of people copy-pasting and clipping content from other social media networks to build their accounts."

https://www.404media.co/zuckerberg-the-ai-slop-will-continue-until-morale-improves/

#SocialMedia #Facebook #Instagram #AI #GenerativeAI #AISlop

"The experiment used 554 resumes and 571 job descriptions taken from real-world documents.

The researchers then doctored the resumes, swapping in 120 first names generally associated with people who are male, female, Black and/or white. The jobs included were chief executive, marketing and sales manager, miscellaneous manager, human resources worker, accountant and auditor, miscellaneous engineer, secondary school teacher, designer, and miscellaneous sales and related worker.

The results demonstrated gender and race bias, said Wilson, as well as intersectional bias when gender and race are combined.

One surprising result: the technology preferred white men even for roles that employment data show are more commonly held by women, such as HR workers.

This is just the latest study to reveal troubling biases with AI models — and how to fix them is “a huge, open question,” Wilson said.

It’s difficult for researchers to probe commercial models as most are proprietary black boxes, she said. And companies don’t have to disclose patterns or biases in their results, creating a void of information around the problem."

https://www.geekwire.com/2024/ai-overwhelmingly-prefers-white-and-male-job-candidates-in-new-test-of-resume-screening-bias/

#AI #HR #Recruiting #AIBias #GenderBias #RaceBias
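The audit design is easy to reproduce in miniature: hold each resume fixed, swap only the name, and count which variant the screening model ranks highest. A rough sketch under those assumptions, with invented example names and a placeholder `score_resume` standing in for the models the researchers actually tested:

```python
from collections import Counter

# Invented example names; the study swapped in 120 real first names
# generally associated with gender and race groups.
NAME_GROUPS = {
    "white_male": "Todd",
    "white_female": "Claire",
    "black_male": "Darnell",
    "black_female": "Keisha",
}

def score_resume(resume_text: str, job_description: str) -> float:
    """Placeholder for the screening model under audit
    (e.g. an embedding-similarity or LLM-based relevance score)."""
    raise NotImplementedError("plug in the model being audited here")

def audit(resume_templates: list[str], job_descriptions: list[str]) -> Counter:
    """For every resume/job pair, swap only the name and count which
    demographic group's variant the model scores highest."""
    wins = Counter()
    for template, job in zip(resume_templates, job_descriptions):
        scores = {
            group: score_resume(template.format(name=name), job)
            for group, name in NAME_GROUPS.items()
        }
        wins[max(scores, key=scores.get)] += 1
    return wins

# Large gaps between groups' win rates, relative to the 25% expected under
# neutrality with four groups, are the kind of disparity the study reports.
```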

"A new OpenAI study using their in-house SimpleQA benchmark shows that even the most advanced AI language models fail more often than they succeed when answering factual questions.

The SimpleQA test contains 4,326 questions across science, politics, and art, with each question designed to have one clear correct answer. Two independent reviewers verified answer accuracy.

[Figure caption: The thematic distribution of the SimpleQA database shows broad coverage, which should allow a comprehensive evaluation of AI models. Image: Wei et al.]

OpenAI's best model, o1-preview, achieved only a 42.7 percent success rate. GPT-4o followed with 38.2 percent correct answers, while the smaller GPT-4o-mini managed just 8.6 percent accuracy.

Anthropic's Claude models performed even worse. Their top model, Claude-3.5-sonnet, got 28.9 percent right and 36.1 percent wrong. However, smaller Claude models more often declined to answer when uncertain – a desirable response that shows they recognize their knowledge limitations."

https://the-decoder.com/gpt-4o-and-co-get-it-wrong-more-often-than-right-says-openai-study/

#AI #GenerativeAI #OpenAI #SimpleQA #Hallucinations #LLMs #Chatbots
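The three outcomes the article distinguishes (correct, incorrect, declined to answer) matter because raw accuracy hides whether a model hedges or confabulates. A small sketch of how such benchmark tallies are typically computed, using invented counts that mirror the Claude-3.5-sonnet percentages quoted above rather than the study's actual grading data:

```python
from collections import Counter

def summarize(grades: list[str]) -> dict:
    """Summarize per-question grades: 'correct', 'incorrect', or 'not_attempted'.

    Reports overall accuracy plus accuracy on attempted questions only, which
    rewards models that decline to answer when uncertain instead of guessing.
    """
    counts = Counter(grades)
    total = len(grades)
    attempted = counts["correct"] + counts["incorrect"]
    return {
        "accuracy": counts["correct"] / total,
        "attempted_accuracy": counts["correct"] / attempted if attempted else 0.0,
        "not_attempted_rate": counts["not_attempted"] / total,
    }

# Invented counts mirroring the quoted percentages (28.9% right, 36.1% wrong):
grades = ["correct"] * 289 + ["incorrect"] * 361 + ["not_attempted"] * 350
print(summarize(grades))  # ~28.9% accurate overall, ~44.5% on attempted questions
```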

"The human in the loop is a false promise, a "salve that enables governments to obtain the benefits of algorithms without incurring the associated harms."

So why are we still talking about how AI is going to replace government and corporate bureaucracies, making decisions at machine speed, overseen by humans in the loop?

Well, what if the accountability sink is a feature and not a bug? What if governments, under enormous pressure to cut costs, figure out how to also cut corners, at the expense of people with very little social capital, and blame it all on human operators? The operators become, in the phrase of Madeleine Clare Elish, "moral crumple zones":"

https://pluralistic.net/2024/10/30/a-neck-in-a-noose/#is-also-a-human-in-the-loop

#AI #PredictiveAI #HumanInTheLoop #Algorithms #AIEthics #AIGovernance

"Microsoft's massive investment in OpenAI is weighing on earnings.

Following Microsoft’s quarterly earnings report on Wednesday, CFO Amy Hood said she expects the company to take a $1.5 billion hit to income in the current period, mainly because of an expected loss from OpenAI.

Microsoft has invested close to $14 billion in OpenAI, whose ChatGPT assistant has gone viral and inspired a whole new industry called generative artificial intelligence. The trend has generated billions of dollars in new revenue for Microsoft.

But OpenAI is losing bundles of cash. The company expects $5 billion in losses this year, before stock-based compensation, on $4 billion in revenue, The Information reported earlier this month, citing documents."

https://www.cnbc.com/2024/10/30/microsoft-cfo-says-openai-investment-will-cut-into-profit-this-quarter.html

#AI #GenerativeAI #Microsoft #AIBubble #OpenAI #AIHype

#AI #GenerativeAI #LLMs: "Comparison and analysis of AI models across key performance metrics including quality, price, output speed, latency, context window & others. Click on any model to see detailed metrics. For more details including relating to our methodology, see our FAQs."

https://artificialanalysis.ai/models

#AI #GenerativeAI #Google #Gemini #OpenAI #LLMs: "As my colleagues Kylie Robison and Tom Warren reported, OpenAI is eyeing a December debut for its next flagship AI model. Fittingly, Google is also aiming to release its next major Gemini 2.0 model in the same month, sources familiar with the plan tell me. As OpenAI and Google continue their game of trying to one-up each other, xAI, Meta, and Anthropic are all also racing to debut their next frontier models.

While OpenAI CEO Sam Altman is doing a phased rollout of the successor to GPT-4, starting first with his business partners, my sources say that Google is planning to widely release the next version of Gemini at the outset. I’ve heard that the model isn’t showing the performance gains the Demis Hassabis-led team had hoped for, though I would still expect some interesting new capabilities. (The chatter I’m hearing in AI circles is that this trend is happening across companies developing leading, large models.)

I’m also hearing that Noam Shazeer, the famed AI researcher whom Google paid an ungodly amount of money to abandon Character.AI, is working on a separate reasoning model to rival OpenAI’s o1. Google declined to comment for this story."

https://www.theverge.com/2024/10/25/24279600/google-next-gemini-ai-model-openai-december

"As many as 5%" of new English Wikipedia articles "contain significant AI-generated content"

"In other words, the paper is a rather unsatisfactory read for those interested in the important question of whether generative AI threatens to overwhelm or at least degrade Wikipedia's quality control mechanisms - or whether these handle LLM-generated articles just fine alongside the existing never-ending stream of human-generated vandalism, hoaxes, or articles with missing or misleading references (see also our last issue, about an LLM-based system that generates gene articles with fewer such "hallucinated" references than human Wikipedia editors). Overall, while the paper's title boldly claims to show "The Rise of AI-Generated Content in Wikipedia", it leaves it entirely unclear whether the text that Wikipedia readers actually read has become substantially more likely to be AI-generated. (Or, for that matter, the text that AI systems themselves read, considering that Wikipedia is an important training source for LLMs - i.e. whether the paper is evidence for concerns that "The ouroboros has begun".)

Secondly and more importantly, the reliability of AI content detection software - such as the two tools that the study's numerical results are based on - has been repeatedly questioned. To their credit, the authors are aware of these problems and try to address them. For example by combining the results of two different detectors, and by using a comparison dataset of articles created before the release of GPT-3.5 in March 2022 (which can be reasonably assumed to be virtually free of LLM-generated text). However, their method still leaves several questions unanswered that may well threaten the validity of the study's results overall."

https://meta.wikimedia.org/wiki/Research:Newsletter/2024/October

#AI #GenerativeAI #Wikipedia #Automation #LLMs
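The newsletter's methodological point, that detector flags only mean something relative to a pre-GPT-3.5 baseline false-positive rate, can be made concrete. A rough sketch, assuming two hypothetical detector functions that each return a probability that a text is AI-generated; the paper's exact combination rule may differ.

```python
def flagged(text: str, detectors, threshold: float = 0.5) -> bool:
    """Flag a text only if every detector exceeds the threshold; requiring
    agreement between detectors is one way to reduce false positives."""
    return all(detect(text) >= threshold for detect in detectors)

def estimated_excess_rate(recent_texts, baseline_texts, detectors) -> float:
    """Subtract the flag rate on a pre-GPT-3.5 corpus (assumed human-written)
    from the flag rate on recent articles; the difference, not the raw rate,
    is the evidence of AI-generated content."""
    def rate(texts):
        return sum(flagged(t, detectors) for t in texts) / len(texts)
    return rate(recent_texts) - rate(baseline_texts)

# detector_a and detector_b are placeholders for GPTZero-style classifiers;
# excess = estimated_excess_rate(recent_articles, pre_2022_articles,
#                                [detector_a, detector_b])
```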

"Should investments in GenAI pay off, societies could save money through greater equity and efficiency and increased rates of automation and innovation. These savings could potentially be invested in socially responsible or environmentally friendly initiatives. However, if they do not pay off, we will have collectively made a small group of high-tech developers and multinationals richer and more powerful, effectively shooting ourselves in the foot. In that scenario, the wider social benefits would remain unrealized, leading us to question the true value of GenAI advancements and ask ourselves: Were these innovations truly beneficial for all? As some experts have warned, “new technologies have always worsened disparities,” and GenAI might be no exception.

Even if we give GenAI the benefit of the doubt, navigating this period remains extremely challenging. It’s a time when the focus and attention given to GenAI may overshadow other pressing social needs. For this reason, we need to pay more attention to social and environmental impacts and push those who are considering GenAI development and deployment to also include these impacts in their assessments."

https://www.yahoo.com/tech/commentary-steep-ai-social-environmental-080000431.html?guccounter=1

#AI #GenerativeAI #Environment #Energy #DataCenters #Inequality

"Less than a year after OpenAI quietly signaled it wanted to do business with the Pentagon, a procurement document obtained by The Intercept shows U.S. Africa Command, or AFRICOM, believes access to OpenAI’s technology is “essential” for its mission.

The September 30 document lays out AFRICOM’s rationale for buying cloud computing services directly from Microsoft as part of its $9 billion Joint Warfighting Cloud Capability contract, rather than seeking another provider on the open market. “The USAFRICOM operates in a dynamic and evolving environment where IT plays a critical role in achieving mission objectives,” the document reads, including “its vital mission in support of our African Mission Partners [and] USAFRICOM joint exercises.”

The document, labeled Controlled Unclassified Information, is marked as FEDCON, indicating it is not meant to be distributed beyond government or contractors. It shows AFRICOM’s request was approved by the Defense Information Systems Agency. While the price of the purchase is redacted, the approval document notes its value is less than $15 million."

https://theintercept.com/2024/10/25/africom-microsoft-openai-military/

#USA #DoD #AI #GenerativeAI #OpenAI #Microsoft #AFRICOM #Africa

#AI #GenerativeAI #Surveillance #Privacy #DataProtection #AdTech: "The potential value of chat data could also lead companies outside the technology industry to double down on chatbot development, Nick Martin, a co-founder of the AI start-up Direqt, told me. Trader Joe’s could offer a chatbot that assists users with meal planning, or Peloton could create a bot designed to offer insights on fitness. These conversational interfaces might encourage users to reveal more about their nutrition or fitness goals than they otherwise would. Instead of companies inferring information about users from messy data trails, users are telling them their secrets outright.

For now, the most dystopian of these scenarios are largely hypothetical. A company like OpenAI, with a reputation to protect, surely isn’t going to engineer its chatbots to swindle a divorced man in distress. Nor does this mean you should quit telling ChatGPT your secrets. In the mental calculus of daily life, the marginal benefit of getting AI to assist with a stalled visa application or a complicated insurance claim may outweigh the accompanying privacy concerns. This dynamic is at play across much of the ad-supported web. The arc of the internet bends toward advertising, and AI may be no exception."

https://www.theatlantic.com/technology/archive/2024/10/chatbot-transcript-data-advertising/680112/

😅

#AI #GenerativeAI #AIAgents #Anthropic: "Anthropic says that Claude outperforms other AI agents on several key benchmarks including SWE-bench, which measures an agent's software development skills, and OSWorld, which gauges an agent's capacity to use a computer operating system. The claims have yet to be independently verified. Anthropic says Claude performs tasks in OSWorld correctly 14.9 percent of the time. This is well below humans, who generally score around 75 percent, but considerably higher than the current best agents, including OpenAI’s GPT-4, which succeed roughly 7.7 percent of the time.

Anthropic claims that several companies are already testing the agentic version of Claude. This includes Canva, which is using it to automate design and editing tasks, and Replit, which uses the model for coding chores. Other early users include The Browser Company, Asana and Notion.

Ofir Press, a postdoctoral researcher at Princeton University who helped develop SWE-bench, says that agentic AI tends to lack the ability to plan far ahead and often struggles to recover from errors. “In order to show them to be useful we must obtain strong performance on tough and realistic benchmarks,” he says, like reliably planning a wide range of trips for a user and booking all the necessary tickets."

https://www.wired.com/story/anthropic-ai-agent/

#AI #Surveillance #Intelligence #MilitaryAI: "In this paper, we outline the significant national security concerns emanating from current and envisioned uses of commercial foundation models outside of CBRN contexts, and critique the narrowing of the policy debate that has resulted from a CBRN focus (e.g., compute thresholds, model weight release). We demonstrate that the inability to prevent personally identifiable information from contributing to ISTAR capabilities within commercial foundation models may lead to the use and proliferation of military AI technologies by adversaries. We also show how the usage of foundation models within military settings inherently expands the attack vectors of military systems and the defense infrastructures they interface with. We conclude that in order to secure military systems and limit the proliferation of AI-based armaments, it may be necessary to insulate military AI systems and personal data from commercial foundation models."

https://arxiv.org/html/2410.14831v1

#SocialMedia #TikTok #AI #ByteDance #GenerativeAI #AITraining: "After rumors swirled that TikTok owner ByteDance had lost tens of millions after an intern sabotaged its AI models, ByteDance issued a statement this weekend hoping to silence all the social media chatter in China.

In a social media post translated and reviewed by Ars, ByteDance clarified "facts" about "interns destroying large model training" and confirmed that one intern was fired in August.

According to ByteDance, the intern had held a position in the company's commercial technology team but was fired for committing "serious disciplinary violations." Most notably, the intern allegedly "maliciously interfered with the model training tasks" for a ByteDance research project, ByteDance said."

https://arstechnica.com/tech-policy/2024/10/bytedance-intern-fired-for-planting-malicious-code-in-ai-models/

#AI #GenerativeAI #AIStartups #OpenAI: "Former OpenAI CTO Mira Murati is reportedly fundraising for a new AI startup

Mira Murati, the OpenAI CTO who announced her departure last month, is raising VC funding for a new AI startup, according to Reuters.

This startup will reportedly focus on building AI products based on proprietary models and could raise more than $100 million in this round."

https://techcrunch.com/2024/10/19/former-openai-cto-mira-murati-is-reportedly-fundraising-for-a-new-ai-startup/?utm_source=dlvr.it&utm_medium=bluesky

#BookPublishers #AI #GenerativeAI #AITraining #Books #Publishers #Copyright #IP: "What's far more likely is that PRH will use whatever legal rights it has to insist that AI companies pay it for the right to train chatbots on the books we write. It is vanishingly unlikely that PRH will share that license money with the writers whose books are then shoveled into the bot's training-hopper. It's also extremely likely that PRH will try to use the output of chatbots to erode our wages, or fire us altogether and replace our work with AI slop.

This is speculation on my part, but it's informed speculation. Note that PRH did not announce that it would allow authors to assert the contractual right to block their work from being used to train a chatbot, or that it was offering authors a share of any training license fees, or a share of the income from anything produced by bots that are trained on our work.

Indeed, as publishing boiled itself down from the thirty-some mid-sized publishers that flourished when I was a baby writer into the Big Five that dominate the field today, their contracts have gotten notably, materially worse for writers:"

https://pluralistic.net/2024/10/19/gander-sauce/#just-because-youre-on-their-side-it-doesnt-mean-theyre-on-your-side

#BigTech #AI #GenerativeAI #DataCenters #NuclearEnergy #FossilFuels #Coal: "The power generation sector is looking at numerous ways to provide enough electricity to satisfy demand from data centers. Bloomberg Intelligence recently said its research shows data centers, buildings filled with servers and other computing equipment for data storage and networking that supports operations and artificial intelligence (AI), could be responsible for as much as 17% of all U.S. electricity consumption by 2030. The U.S. Dept. of Energy (DOE) has said one data center can require 50 times the electricity of a typical office building.

Several technology groups are looking at nuclear power, including the use of small modular reactors (SMRs), to meet their electricity needs. Energy analysts have said natural gas, whether burned in large-scale facilities or peaker plants, also will be important.

Power consumption from data centers, though, also is benefiting coal-fired power plants, some of which may be kept running longer than expected in order to meet the increased demand for electricity from companies such as Google, Meta, Amazon Web Services (AWS), and others. Some coal-fired plants already have gotten a reprieve in areas where more energy is needed as data centers come online, or are in the planning stages."

https://www.powermag.com/power-demand-from-data-centers-keeping-coal-fired-plants-online/

#AI #GenerativeAI #AIDetectors #Universities #HigherEd: "About two-thirds of teachers report using an AI checker regularly, according to a survey of more than 450 instructors published in March by the Center for Democracy & Technology.

The best AI writing detectors are highly accurate, but they’re not foolproof. Businessweek tested two of the leading services—GPTZero and Copyleaks—on a random sample of 500 college application essays submitted to Texas A&M University in the summer of 2022, shortly before the release of ChatGPT, effectively guaranteeing they weren’t AI-generated. The essays were obtained through a public records request, meaning they weren’t part of the datasets on which AI tools are trained. Businessweek found the services falsely flagged 1% to 2% of the essays as likely written by AI, in some cases claiming to have near 100% certainty.

Even such a small error rate can quickly add up, given the vast number of student assignments each year, with potentially devastating consequences for students who are falsely flagged. As with more traditional cheating and plagiarism accusations, students using AI to do their homework are having to redo assignments and facing failing grades and probation."

https://www.bloomberg.com/news/features/2024-10-18/do-ai-detectors-work-students-face-false-cheating-accusations
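The "small error rate adds up" point is simple arithmetic: at the false-positive rates Businessweek measured, routine detector use across a large student body wrongly flags a steady stream of students who never used AI. A back-of-the-envelope sketch with illustrative volumes, not figures from the article:

```python
def expected_false_accusations(human_written_essays: int, false_positive_rate: float) -> float:
    """Expected number of human-written essays wrongly flagged as AI-generated."""
    return human_written_essays * false_positive_rate

# Illustrative only: a university processing 50,000 human-written submissions a
# year, checked with detectors in the 1-2% false-positive range Businessweek measured.
for fpr in (0.01, 0.02):
    print(f"FPR {fpr:.0%}: ~{expected_false_accusations(50_000, fpr):,.0f} false flags per year")
```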