Zürcher Nachrichten - Inner workings of AI an enigma - even to its creators

Inner workings of AI an enigma - even to its creators / Photo: Kirill KUDRYAVTSEV - AFP

Even the greatest human minds building generative artificial intelligence, a technology poised to change the world, admit they do not comprehend how digital minds think.

"People outside the field are often surprised and alarmed to learn that we do not understand how our own AI creations work," Anthropic co-founder Dario Amodei wrote in an essay posted online in April.

"This lack of understanding is essentially unprecedented in the history of technology."

Unlike traditional software programs that follow pre-ordained paths of logic dictated by programmers, generative AI (gen AI) models are trained to find their own way to success once prompted.

In a recent podcast Chris Olah, who was part of ChatGPT-maker OpenAI before joining Anthropic, described gen AI as "scaffolding" on which circuits grow.

Olah is considered an authority in so-called mechanistic interpretability, a method of reverse engineering AI models to figure out how they work.

This science, born about a decade ago, seeks to determine exactly how AI gets from a query to an answer.

"Grasping the entirety of a large language model is an incredibly ambitious task," said Neel Nanda, a senior research scientist at the Google DeepMind AI lab.

It was "somewhat analogous to trying to fully understand the human brain," Nanda told AFP, noting that neuroscientists have yet to succeed on that front.

Delving into digital minds to understand their inner workings has gone from a little-known field just a few years ago to being a hot area of academic study.

"Students are very much attracted to it because they perceive the impact that it can have," said Boston University computer science professor Mark Crovella.

The area of study is also gaining traction due to its potential to make gen AI even more powerful, and because peering into digital brains can be intellectually exciting, the professor added.

- Keeping AI honest -

Mechanistic interpretability involves not just studying the results served up by gen AI but also scrutinizing the calculations performed while the technology mulls queries, according to Crovella.

"You could look into the model...observe the computations that are being performed and try to understand those," the professor explained.

Startup Goodfire uses AI software capable of representing data in the form of reasoning steps to better understand gen AI processing and correct errors.

The tool is also intended to prevent gen AI models from being used maliciously or from deciding on their own to deceive humans about what they are up to.

"It does feel like a race against time to get there before we implement extremely intelligent AI models into the world with no understanding of how they work," said Goodfire chief executive Eric Ho.

In his essay, Amodei said recent progress has made him optimistic that the key to fully deciphering AI will be found within two years.

"I agree that by 2027, we could have interpretability that reliably detects model biases and harmful intentions," said Auburn University associate professor Anh Nguyen.

According to Boston University's Crovella, researchers can already access representations of every digital neuron in AI brains.

"Unlike the human brain, we actually have the equivalent of every neuron instrumented inside these models," the academic said. "Everything that happens inside the model is fully known to us. It's a question of discovering the right way to interrogate that."

Harnessing the inner workings of gen AI minds could clear the way for its adoption in areas where tiny errors can have dramatic consequences, like national security, Amodei said.

For Nanda, better understanding what gen AI is doing could also catapult human discoveries, much as DeepMind's chess-playing AI, AlphaZero, revealed entirely new chess moves that no grandmaster had ever thought of.

A gen AI model that is properly understood and carries a stamp of reliability would gain a competitive advantage in the market.

Such a breakthrough by a US company would also be a win for the nation in its technology rivalry with China.

"Powerful AI will shape humanity's destiny," Amodei wrote.

"We deserve to understand our own creations before they radically transform our economy, our lives, and our future."

W.O.Ludwig--NZN