12 July, 2023

Adventures in using language-based AI models to learn about botany

First off: Hello, world (this being my first journal post and all).

All right. Now that’s out of the way, some general background information --

I’m a plant person. This happened slowly – I have always felt a deep respect for the land due to my upbringing – and then so fast I barely had time to blink or catch my breath after a guided walk through a prairie remnant irrevocably altered the way I saw the land around me.

There was me before, and me after.

The me after is the one who uses iNaturalist – the one who goes on long walks to observe and study and understand my surroundings; the same one who has been easing into community identifications as I gain more knowledge.

Say we go to a nearby piece of wild-looking land together. For the most part, I can name the plants for you. I can read last week’s weather based on today’s blooms; I can tell you what kind of soil exists beneath our feet. I can point out what’s native and what’s not; I can tell you how the land was treated in the past; if it was plowed or seeded or simply left alone.

And yet – ask me how I know which plant is which or bring up the stigma or style or stamen and my mind goes blank as I try to find the right words to respond.

A trained botanist I am not.

Lately I’ve been getting into dichotomous keys. I want to have the correct words to describe things; I also want to understand the more technical descriptions I run across when interacting with others with more classical training.

Textbook learning has never been my forte. And with botanical descriptions, there is no shortage of words and phrases to learn and memorize.

This is where language-based AI models have come in handy for me.

Truth: I’ve been a reluctant adopter. As a person who occasionally creates art professionally, I worry about how these tools are infringing on intellectual property. Even more, I worry about the diminishment of new innovations in favor of an echo chamber of regurgitated ideas – an Ouroboros of humans using AI to do tasks trained on recirculated data provided by humans, if you will.

One night, not too long ago, I was running through a dichotomous key about Tragia spp. One of the determining factors involves the size of the persistent base of the staminate pedicel in relation to its subtending bract.

Perhaps you can read this and understand what it means – but I could not. And search engines were not much help either. Search engines do well with common terms and broad queries; this was too specific.

After trying to find an online resource that could help, I decided to give AI a spin. I’d been using it at work to sort and manipulate data with good results; I’d even used it to write a few PowerShell scripts for me that would have taken me considerably longer to code by hand. I’d gotten the hang of how to converse with it too. Questions need to be stated clearly; extra context for subject matter helps, for example.

Hello, I typed. I am trying to understand a particular phrase as it pertains to a botanical description. Can you help me understand what this means: persistent base of a staminate pedicel.

Here’s the thing – it did really well. The description was clear and easy to understand.

Thank you, that’s helpful. The description also mentions a “subtending bract.” Can you describe what that means as well?

Once again, the answer was clear enough for me to understand without a strong background in biology.

Great, I typed. Let’s return to the earlier phrase, the persistent base of a staminate pedicel. To help me visualize this, can you provide an example of a plant from the Blackland Prairie eco-region that exhibits this characteristic?

Once again, the response was phenomenally useful. The response named a species I was familiar with and suddenly I could visualize the technical term quite well.

A traditional search engine simply can’t provide this level of interaction.

Humans can – though even the most generous humans have limits with their time and patience; I don’t know anybody who wants to be on call as my phone-a-friend each time I look at a dichotomous key and need help understanding something – which is basically each time I read a new sentence.

I’ve run this same experiment past various AI tools. Overall, they do well breaking down the technical language. They’ve also been able to successfully help me visualize the descriptions by providing additional examples of species based on my eco-region.

They’re also a handy way to learn more about scientific names.

Hello, I am studying botany. I am trying to understand whether the Latin epithet “strepens” has any particular meaning. What can you tell me about it?

The answer is fast and easy to understand.

Thank you, that’s helpful, I type. Can you tell me if any words in modern English are derived from the Latin root?

Another set of answers and examples – and suddenly I can connect the meaning of the epithet to modern words (strepens and obstreperous, for example), which will help me retain the meaning the next time I run across it.

Ramosa, virgatum, urticifolia.

And so forth.

Of course there are limits.

On several occasions I’ve tried to forgo the dichotomous key and see whether AI can tell me how to distinguish between two species. This is where it begins to falter, although AI does not know it’s faltering.

I receive responses with definitive answers that aren’t correct, or are only partially correct. Language-based models are trained on a wide array of texts; they rehash information cobbled together from dense resources, and things get mixed up.

A flower is mentioned as always white when that’s not the case. A length of a corolla is given that’s considerably different from what it should be. When I ask for a source, the AI model apologizes and says it actually isn’t certain of the information. It says it cannot provide a source and that it's sorry it provided me with inaccurate details.

Note that I didn't state it was wrong - it just realized on its own that it was not accurate when I wanted a source.

Another time, I ask for a source when the information is correct – just to see what will happen. It does not apologize for being incorrect and provides me with a reputable source.

Yet another day, when I ask for a source, I’m told it’s not possible to ever give me a source because it simply doesn’t provide sources. This makes me wonder whether it was an anomaly for me to have previously received an actual source.

Most recently, I’ve been curious how AI tools do with general information about species.

What can you tell me about Paspalum pubiflorum?

I receive a good answer; I ask about Paspalum dilatatum. Another good answer. The information is sound and accurate.

Then I ask about Paspalum langei – and receive an answer that is mostly incorrect. The details aren’t made up, exactly; they just seem to be about another Paspalum species.

There is value and promise here; there is also danger here.

Despite the issues, I’m going to continue on this journey in places where it makes sense. Can AI help me with botanical descriptions? Yes. Can I use it to gain insight about scientific names and to connect Latin words to modern English? You bet.

Should I use it to gather general information about species? Or discern how two species are different? No - probably not, based on my experiences. Not at this juncture.

And that leads me to this, as I have a pressing question...

...Hello, I’m trying to understand a particular phrase as it pertains to botany. Can you describe what “stigmatic surfaces not papillate” means?

