Sunday, August 16, 2015

Artificial Intelligence Weirdness. Need categorizing relate to visual cues?

... humanity shouldn’t assume our machines think as we do. Neural nets sometimes think differently. And we don’t really know how or why. -- David Berreby[1]

New Memories Never Formed. I have a neighbor in my retirement community[2] who had a stroke three years ago that seems to have impaired her ability to retain recent memories. I have conversed with her on several occasions. On each new occasion we meet she asks me, “Who are you? Have you been here long?”

When I bring an old (pre-stroke) friend of hers with me to visit her, she recognizes the friend immediately and then asks him or her, “Who is this person you brought with you?” And so it continues, although I do not dress or look different so far as any others in my community are concerned.

Read Some Plato! So what’s new? No recall, no recognition. No recognition, no identification! Even Plato knew that! And wrote about it, too. Read up!

Not so fast! That’s a quick response, but an inaccurate one. It’s only part of the story, because recognition needn’t depend on recall. An example: however hard you looked in 2015, you could not have recognized the 46th President of the United States. Why? Because he or she hadn’t been elected and sworn in, even though he or she, as a person, existed then. There was not even a likely candidate for the position down the road to guess about.

But once someone is sworn in on January 20th, 2025 as POTUS, you will, with little doubt -- barring a stroke -- recognize him or her as such, whether or not you have seen him or her in person.[3] Such recognition requires neither memory (the first time) nor visible presence -- e.g. a TV picture will do.

Weirdly Inhuman Artificial Intelligence. David Berreby, in his article "Artificial Intelligence is Already Weirdly Inhuman," gives some surprising examples of how neural-net computers consistently "miscategorize" visual input data. Two pictures of a dog, almost identical to humans, are classified as a dog and a giraffe. Two relatively random patterns of speckles, discernibly quite different to humans, are categorized as a starfish and a cheetah. Berreby comments that no one yet has an explanation to offer for these results.

It is important to understand that not even the neural network computers jump directly from light-input-patterns to category-outputs. The “visual” inputs are processed through various algorithms which, depending upon certain conditions, yield categorical outputs. As the examples of the dog, giraffe, starfish and cheetah show, there appears to be something going on in either the inputs or the algorithmic processing that does not “replicate(?)” or “parallel(?)” the human processes. (We can’t even assume there are some well-defined isomorphic relations[3B] between the human and the computer processes: thus the bracketed question-marks. -- my comment, not Berreby’s.)

Let's check some possible ways of dealing with this “weirdness.” Are the computer visual-inputs sensitive to light frequencies that human eyes are not? Are there internal interference phenomena that are different in humans than in computers? Clearly there are algorithms (actually, heuristics) that humans learn to use for identification that might not yet be available to computers, especially those that depend upon conditions of social interaction or on multisensory input coordination. (This is difficult stuff. I really don’t begrudge AI researchers the personification of their apparatus, i.e. as thinking, seeing, etc., if it keeps up their enthusiasm.)

Some Human Identification Algorithms. For many, many English categories of objects that are visually discernibly different, we have hosts of subcategories. These subcategories enable us, for practical purposes, to recognize as “the same,” i.e. as “time-pieces,” objects which have quite different appearances and functioning parts, e.g. a sundial, an hour-glass, and an iPad. Meta-categories such as “purpose” or “typical use” and the like help us to sort out discernibly different time-pieces into practical categories.

It’s important to note that “recognition” is not a unique process. We use the term to indicate both
a. our experience of visual -- more generally, sensory -- familiarity based on previous experience, i.e. recall; and,

b. an act of participation in a social practice of equal treatment, i.e. acknowledgement: this thing shall be treated, under certain circumstances, as a “time piece.”
(See Two Senses of 'Recognize'.)
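The two senses above can be kept apart in a small sketch. This is only an illustration, not a model of human cognition: the Observer class, its two fields, and the "tells the time" rule are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Observer:
    """Hypothetical observer that can 'recognize' in two distinct senses."""
    # Sense (a): appearances already encountered (recall).
    seen_appearances: set = field(default_factory=set)
    # Sense (b): agreed rules mapping a typical use to a category (acknowledgement).
    treat_as: dict = field(default_factory=dict)

    def recalls(self, appearance: str) -> bool:
        """Sense (a): familiarity based on previous experience."""
        return appearance in self.seen_appearances

    def acknowledges(self, typical_use: str):
        """Sense (b): the category this item shall be treated as, or None."""
        return self.treat_as.get(typical_use)


observer = Observer(
    seen_appearances={"sundial"},
    treat_as={"tells the time": "time-piece"},
)

# An hour-glass has never been seen, yet it can be acknowledged as a time-piece.
print(observer.recalls("hour-glass"))           # sense (a)
print(observer.acknowledges("tells the time"))  # sense (b)
```

The point of the sketch is only that the second lookup can succeed where the first fails, which is the situation of the newly sworn-in President discussed earlier.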

It is important to understand that categorization for adult humans, except for the most basic uses, e.g. learning to identify paradigmatic examples, is not based substantially on recall, but on processes that establish recognition-equivalence.[4]

This requires one or more meta-sets of algorithms (or heuristics) that are called into play to work on items pre-processed by lower level algorithms.

The following categories, which could be used as algorithm names, are meta-set indicators: aging, larval, decrepitating, disguised, worn-out, broken, in-process, unintentional, illegal, shrunken, decayed, inebriated, etc. Paradigm objects subjected to the algorithmic meta-set processes may be identified as recognition-equivalent to the paradigm members despite substantial deviation in appearance.[5]
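One way to picture a meta-set at work is as a table of named transformations applied to a paradigm description. The sketch below is a toy under stated assumptions: the feature names (has_hands, ticks, size), the three transformations, and the function recognition_equivalent are all invented for illustration, and real recognition is surely not a matter of exact feature matching.

```python
# Paradigm description of a category member (illustrative features only).
PARADIGM_WATCH = {"has_hands": True, "ticks": True, "size": "normal"}


# Each meta-set indicator names a transformation of a paradigm's features.
def broken(item):
    return {**item, "ticks": False}


def shrunken(item):
    return {**item, "size": "small"}


def worn_out(item):
    return {**item, "has_hands": False}


META_SET = {"broken": broken, "shrunken": shrunken, "worn-out": worn_out}


def recognition_equivalent(observed, paradigm, meta_set=META_SET):
    """Return the indicators under which `observed` matches `paradigm`.

    An empty list means no single transformation accounts for the deviation.
    """
    if observed == paradigm:
        return ["paradigm"]
    return [name for name, transform in meta_set.items()
            if transform(paradigm) == observed]


stopped_watch = {"has_hands": True, "ticks": False, "size": "normal"}
print(recognition_equivalent(stopped_watch, PARADIGM_WATCH))  # ['broken']
```

The stopped watch looks unlike the paradigm, yet it is still sorted with the watches, because "broken" accounts for the difference. An item that no listed transformation accounts for simply comes back with an empty list.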

Addendum 8/15/16: for an interesting new article see AI's Language Problem, MIT Technology Review, 8/9/16.

--- EGR


[1] Berreby, D. Nautilus, 8/8/2015. Artificial Intelligence is Already Weirdly Inhuman.

Many AI enthusiasts use the verb “think” somewhat over-enthusiastically -- perhaps for promotional purposes. But is this only a stretched metaphor? Why is it important to stretch it? Do computers “think” in any way recognizably as humans, or animals, do? For some discussion on this issue see "Thinking" Like Computers Do.

[2] See Foulkeways at Gwynedd

[3] See Part 3: Recognition and Knowing

[3B] See Isomorphism: Program, Structure, and Process -- a catalog

[4] See Recognition-Equivalence

[5] See Dimensions of Individuation. Such dimensions can be used to identify the algorithms applicable to the superset of individuals under consideration. So, for example, when considering a professional football team, we might ask How can we tell the half-backs from the full-backs? Or, How can we identify individual contract-holders?

Sunday, August 9, 2015

Charades of Evaluation: mis-connecting cause and effect

updated 4/23/19
"…passing a failing student is the #1 worst thing a teacher can do. … Changing grades is the most undermining contribution to a student’s failure, but above all else – it invalidates your data. Putting aside creating and submitting inaccurate school data for the moment, entering a 'false grade' will make it virtually impossible to reliably measure any improvement of your skills as a teacher. Your improvement will now be based on unsound and worthless data." -- M. Cubbin (8/2/15) The Business of School

"… Is the Customer Always Right?" -- Farrington, Frank (1915) in Merck Report, Volume 24 pg 134-135

Pseudo-Evaluation. Data are not fundamental. On their face, data cannot be distinguished from outputs of a random number generator. Data are like shell chips on a beach mixed in with sand; or, like foam on the tide. (For an article relating “data” and “objectivity,” see Can Criminal or Immoral Behavior Be Dealt With Objectively?)

Far more important to know is what the processes are by which the putative data are collected. And even more critical is knowing which theories connect the data and collection process to what supposedly they indicate.

Much “data-collection” is like trying to identify proportions of bird-species during a fall migration. If the process were merely to tally varieties of southward flying objects, we might well end up confounding red-winged blackbirds with jet planes and monarch butterflies. (See Is It Really a Test? Or Just Another Task?)
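The migration example can be put in miniature. The sightings below are made up for illustration, and the is_bird flag stands in for whatever theory of the collection process tells us which flying objects count; neither comes from any real survey.

```python
from collections import Counter

# Hypothetical sightings: (label, heading, is_bird).
sightings = [
    ("red-winged blackbird", "south", True),
    ("red-winged blackbird", "south", True),
    ("jet plane", "south", False),
    ("monarch butterfly", "south", False),
    ("canada goose", "north", True),
]


def naive_tally(records):
    """Count every southward-flying object, whatever it is."""
    return Counter(label for label, heading, _ in records if heading == "south")


def bird_tally(records):
    """Count only southward-flying objects that the collection theory calls birds."""
    return Counter(label for label, heading, is_bird in records
                   if heading == "south" and is_bird)


def proportions(tally):
    """Convert raw counts into shares of the total."""
    total = sum(tally.values())
    return {label: count / total for label, count in tally.items()}


print(proportions(naive_tally(sightings)))
print(proportions(bird_tally(sightings)))
```

Under the naive tally the blackbirds make up half of the "migration"; once the tally is restricted to birds, they make up all of it. The raw sightings are identical in both cases. Only the process and the theory behind it differ.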

In the above epigraph, Cubbin assumes a connection, presumably possible under ideal conditions, between school grades, student failure, and teacher skill. Using teachers as graders cuts costs, but is begging for inconsistency: not because teachers may not “know their stuff,” but because many institutional processes can overrule even the best of teachers’ judgments, e.g. administrators’ prerogative, special education policies, or political involvement in the grading process. There is little consensus, when interest-group push comes to shove, on either goals or concepts appropriate to education. (See What Does a Consensus Mean, Anyway?)

Requiring teachers to grade, then grading the teachers, is like judging baseball coaches by their players' batting averages. There will likely be only a tenuous connection, if any, between the data and the coaching (teaching) efforts supposed to have caused them. (See Power Failure: Losing the Series; Blaming the Bat Boys.)

The Diploma-Holder Markets: is the customer always right? An important assumption Cubbin seems to make is that markets for test-passers are made up of persons looking for those who possess certain proven skills. Such buyers are only a minor proportion of the markets for certificate- or diploma-holders.

Consider these other markets, for whom actual skill levels are a distant second consideration, if a consideration at all, to applicant grade-point average:
a. Colleges, public, private or commercial, who have external, e.g. federal, or foundational, funds available for applicants with a certain grade-point average -- especially if the recipient institutions have tight budgets;

b. Institutions legally required to have certified staff but faced with employee scarcities, e.g. hospitals, clinics, civil-service;

c. School districts needing both adults certified as teachers, and students with birth and health certificates, in order to be run at even some remove from peak efficiency;

d. Government administrations pursuing certain public policy initiatives that depend on items a, b and c, preceding; e.g. special education, affirmative action, STEM (Science, Technology, Engineering & Mathematics); and last but not least,

e. Colleges which favor legacy applicants, i.e. the children of (paying) parents who are past graduates.

If skills really counted, there would be something like standard board examinations to be passed and, normally, retaken at regular intervals. Teacher grades would not be accepted in place of board exam results. (See The Dangers of Diplomas)

But where the sheepskin alone is most important, rarely will the sheep’s diet be.

For examples and to pursue the issues raised in this essay, see

1. “Data-Driven”: a slogan to distract from organizational disagreement?;

2. Classification Error in Evaluation Practice:
the impact of the "false positive" on educational practice and policy

--- EGR