A new artificial intelligence (AI) model has just achieved human-level results on a test designed to measure “general intelligence”. On December 20, OpenAI’s o3 system scored 85% on the ARC-AGI benchmark, well above the previous AI best score of 55% and on par with the average human score. It also scored well on a very difficult mathematics test.

Creating artificial general intelligence, or AGI, is the stated goal of all the major AI research labs. At first glance, OpenAI appears to have at least made a significant step towards this goal.

While scepticism remains, many AI researchers and developers feel something just changed. For many, the prospect of AGI now seems more real, urgent and closer than anticipated. Are they right?

To understand what the o3 result means, you need to understand what the ARC-AGI test is all about. In technical terms, it’s a test of an AI system’s “sample efficiency” in adapting to something new – how many examples of a novel situation the system needs to see to figure out how it works.

An AI system like ChatGPT (GPT-4) is not very sample efficient. It was “trained” on millions of examples of human text, constructing probabilistic “rules” about which combinations of words are most likely. The result is a system that is pretty good at common tasks but bad at uncommon ones, because it has less data (fewer samples) about those tasks.

Until AI systems can learn from small numbers of examples and adapt with more sample efficiency, they will only be used for very repetitive jobs and ones where the occasional failure is tolerable.

The ability to accurately solve previously unseen or novel problems from limited samples of data is known as the capacity to generalise. It is widely considered a necessary, even fundamental, element of intelligence.

The ARC-AGI benchmark tests for sample-efficient adaptation using little grid square problems like the one below.
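To make the format concrete, here is a toy sketch (our illustration, not OpenAI’s method) of what an ARC-style task looks like in code: a handful of input/output grid pairs, and a solver that must find a single rule consistent with all of them. The grids, candidate rules and task are all invented for this example.

```python
# Toy ARC-style task: grids are lists of lists of integers (colours).
# The solver must infer, from a few examples, which rule maps input to output.

def rot90(g):
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*g[::-1])]

def flip_h(g):
    """Mirror a grid left-to-right."""
    return [row[::-1] for row in g]

CANDIDATE_RULES = {"rotate": rot90, "mirror": flip_h, "identity": lambda g: g}

def solve(train_pairs, test_input):
    """Return the first candidate rule consistent with every training pair."""
    for name, rule in CANDIDATE_RULES.items():
        if all(rule(inp) == out for inp, out in train_pairs):
            return name, rule(test_input)
    return None, None

# Three examples of a "mirror" task -- the solver must generalise to a fourth.
train = [
    ([[1, 0], [0, 0]], [[0, 1], [0, 0]]),
    ([[2, 0], [3, 0]], [[0, 2], [0, 3]]),
    ([[0, 5], [0, 0]], [[5, 0], [0, 0]]),
]
name, answer = solve(train, [[7, 0], [0, 8]])
```

Real ARC-AGI tasks are far harder than this: the space of plausible rules is enormous, which is exactly why sample-efficient generalisation is difficult.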
The AI needs to figure out the pattern that turns the grid on the left into the grid on the right. Each question gives three examples to learn from. The AI system then needs to figure out the rules that “generalise” from the three examples to the fourth. These are a lot like the IQ tests you might remember from school.

We don’t know exactly how OpenAI has done it, but the results suggest the o3 model is highly adaptable. From just a few examples, it finds rules that can be generalised.

To figure out a pattern, we shouldn’t make any unnecessary assumptions, or be more specific than we really have to be. In theory, if you can identify the “weakest” rules that do what you want, then you have maximised your ability to adapt to new situations.

What do we mean by the weakest rules? The technical definition is complicated, but weaker rules are usually ones that can be described in simpler statements.

In the example above, a plain English expression of the rule might be something like: “Any shape with a protruding line will move to the end of that line and ‘cover up’ any other shapes it overlaps with.”

While we don’t know how OpenAI achieved this result just yet, it seems unlikely they deliberately optimised the o3 system to find weak rules. However, to succeed at the ARC-AGI tasks it must be finding them.

We do know that OpenAI started with a general-purpose version of the o3 model (which differs from most other models, because it can spend more time “thinking” about difficult questions) and then trained it specifically for the ARC-AGI test.

French AI researcher Francois Chollet, who designed the benchmark, believes o3 searches through different “chains of thought” describing steps to solve the task. It would then choose the “best” according to some loosely defined rule, or “heuristic”. This would be “not dissimilar” to how Google’s AlphaGo system searched through different possible sequences of moves to beat the world Go champion.
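Chollet’s picture of search over “chains of thought” can be sketched in miniature (a guess at the shape of the process, not o3’s actual mechanism): enumerate candidate step sequences, keep those consistent with the examples, then rank the survivors with a heuristic, here simply “prefer the shortest”. The step library and task are invented for illustration.

```python
# Toy search over "chains of thought": each chain is a sequence of simple
# steps; we keep chains consistent with the examples and rank them.

from itertools import product

STEPS = {"inc": lambda x: x + 1, "dbl": lambda x: 2 * x}

def run(chain, x):
    """Apply a chain of steps to an input value."""
    for step in chain:
        x = STEPS[step](x)
    return x

def search(examples, max_len=3):
    """Enumerate step sequences; return those consistent with all examples."""
    survivors = []
    for n in range(1, max_len + 1):
        for chain in product(STEPS, repeat=n):
            if all(run(chain, x) == y for x, y in examples):
                survivors.append(chain)
    return survivors

# Two different "programs" fit these examples; the heuristic (a stand-in
# for a learned ranking model, as in AlphaGo) decides between them.
examples = [(1, 4), (3, 8)]
survivors = search(examples)
best = min(survivors, key=len)
```

The point of the toy example: more than one seemingly valid program fits the same few examples, so the heuristic that chooses among them does real work, just as the article describes.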
You can think of these chains of thought like programs that fit the examples. Of course, if it is like the Go-playing AI, then it needs a heuristic, or loose rule, to decide which program is best. There could be thousands of seemingly equally valid programs generated. That heuristic could be “choose the weakest” or “choose the simplest”.

However, if it is like AlphaGo then they simply had an AI create a heuristic. This was the process for AlphaGo: Google trained a model to rate different sequences of moves as better or worse than others.

The question then is, is this really closer to AGI? If that is how o3 works, then the underlying model might not be much better than previous models. The concepts the model learns from language might not be any more suitable for generalisation than before. Instead, we may just be seeing a more generalisable “chain of thought” found through the extra steps of training a heuristic specialised to this test. The proof, as always, will be in the pudding.

Almost everything about o3 remains unknown. OpenAI has limited disclosure to a few media presentations and early testing to a handful of researchers, laboratories and AI safety institutions. Truly understanding the potential of o3 will require extensive work, including evaluations, an understanding of the distribution of its capacities, how often it fails and how often it succeeds.

When o3 is finally released, we’ll have a much better idea of whether it is approximately as adaptable as an average human. If so, it could have a huge, revolutionary economic impact, ushering in a new era of self-improving accelerated intelligence. We will require new benchmarks for AGI itself and serious consideration of how it ought to be governed.

If not, then this will still be an impressive result. However, everyday life will remain much the same.
Michael Timothy Bennett, PhD Student, School of Computing, Australian National University and Elija Perrier, Research Fellow, Stanford Center for Responsible Quantum Technology, Stanford University

This article is republished from The Conversation under a Creative Commons license. Read the original article.