DOC: AI-Gen examples ctypeslib.as_ctypes_types #26827

otieno-juma · 2024-07-02T15:24:53Z

I used AI Llama 3 to help create these.
@bmwoodruff and I reviewed them.
[skip actions] [skip azp] [skip cirrus]

matthew-brett · 2024-07-02T15:27:28Z

Is now the time to ask whether we have a licensing concern about AI-generated code (for example, that it will pick up GPL-3 content)?

bmwoodruff · 2024-07-03T18:05:29Z

I think that's a great question to bring up now. We wanted to make sure anything AI generated was tagged appropriately, so that it's easy to remove if it becomes a problem at some point.

rgommers · 2024-07-07T07:57:50Z

It would be good to explain how this PR was put together. E.g.:

What was the LLM model prompt you used? And did this come out on the first try, or did you guide it to construct a more relevant example than it could do by itself?
Did you use a code search tool to see if this example came from somewhere else verbatim?

rgommers · 2024-07-07T08:00:26Z

Logistical comment: please use [docs only] in your commit messages for docs-only PRs, as documented in the contributing guide, to avoid running a lot of unnecessary CI jobs.

bmwoodruff · 2024-07-07T18:24:37Z

please use [docs only] in your commit messages

We've been using [skip actions] [skip azp] [skip cirrus] while we wait for #26316 to be complete. I double checked the contributing guide to see if I'd missed something. I've been watching for [docs only] to be merged in, and will swap when it's ready. If using [skip actions] [skip azp] [skip cirrus] is incorrect till then, please let me know. That's easy to fix.

I'm pretty sure the current failed tests are related to a different issue which was fixed right before the 5th PR was contributed. I haven't wanted to push an empty commit yet, with the right tags, to run the docs only tests till after a discussion at the triage meeting (and enough time has lapsed for all to weigh in on the mailing list).

We can also make sure to combine examples from several functions into one PR to help reduce CI costs. We could combine all 5 of the currently submitted PRs into one, if you think that's appropriate. I think grouping 3 functions into one PR is what the recommendation was from the last triage meeting. I gave @otieno-juma the option of submitting the PRs he'd already had ready for over a week, or merging them into one before submitting, and he chose to submit each individually.

explain how this PR was put together

The explanation is currently fully visible, yet scattered across many commits and issues at https://github.com/possee-org/genai-numpy/. I'll put together an organized short cohesive narrative early this week. When it's ready, I'll share a link here and post to the ongoing discussion on the mailing list.

eagunn · 2024-07-08T18:37:18Z

I do understand and appreciate the larger issues about AI-generated code and copyright for the codebase as a whole.

However, the code chunks involved in all these related PRs are 3-4 line snippets within the docstrings for a function, intended to demonstrate the use of that function. These should be canonical examples; what any of us who teach coding might put in our lecture notes or type out during a demonstration in front of a class. If they did NOT match other code available out on the net, either character for character or within a few characters, I'd be more concerned.

Meanwhile, I hope that @otieno-juma, @bmwoodruff, and the rest of the POSSE (sp?) team are getting the basic content feedback that they need. Are the variable names clear enough? Should there be comments within the example? Should strings such as 'i4' be assigned to a variable with a meaningful name before being used, so it is clearer what role that value is playing in the example? Unfortunately, I can't do any of that for this example because I, frankly, have no idea what the code is doing.

charris · 2024-07-14T16:10:50Z

However, the code chunks involved in all these related PRs are 3-4 line snippets within the docstrings for a function

Agree that licensing problems are unlikely here. I think it would be fine to review these like any other examples.

bmwoodruff · 2024-07-17T18:27:02Z

Here's a write-up of the process we used to create the examples:

Narrative: Process used to create Gen-AI examples possee-org/genai-numpy#124

rgommers · 2024-07-17T20:14:28Z

It seems okay to review and merge this, given the process of putting together this PR that @bmwoodruff posted above. It's in particular good to see that the generated code was:

picked by hand from a much larger set of potential examples, and judged as suitable
optimized for matching existing numpy examples
edited by hand to ensure correctness of outputs

charris · 2024-07-18T01:19:45Z

close/reopen

charris · 2024-07-18T02:58:30Z

I think this needs a rebase on current main to fix the circleci failure. There was an update to the .circleci/config.yml on 6/13/2024. A simple git rebase main should work from your branch after main is updated from the github repo here.

@bmwoodruff

charris · 2024-07-18T20:45:07Z

Thanks @otieno-juma .

ngoldbaum · 2024-07-19T19:16:30Z

Hi all, it looks like this broke the "benchmarks" CI, which runs the doctests:

=================================== FAILURES ===================================
___________________ [doctest] numpy.ctypeslib.as_ctypes_type ___________________
499           `ctypes.Structure`\ s
500         - insert padding fields
501 
502         Examples
503         --------
504         Converting a simple dtype:
505 
506         >>> dt = np.dtype('i4')
507         >>> ctype = np.ctypeslib.as_ctypes_type(dt)
508         >>> ctype
Expected:
    <class 'ctypes.c_int32'>
Got:
    <class 'ctypes.c_int'>

We recently updated how the doctests are running, please make sure that they run against PRs that touch docstrings. Just running the circleCI doc builder is not sufficient right now.

#26989 has a fix.
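
The mismatch above can be reproduced without numpy. The sketch below is only an illustration (the helper name fixed_width_name is hypothetical, and this is not the fix in #26989): ctypes binds c_int32 to whichever native C integer type happens to be 4 bytes wide, so the class name that appears in the repr depends on the platform, while the size does not.

```python
import ctypes


def fixed_width_name(ctype):
    """Return (class name, size in bytes) for a ctypes integer type.

    Hypothetical helper, used only to illustrate the aliasing.
    """
    return ctype.__name__, ctypes.sizeof(ctype)


name, size = fixed_width_name(ctypes.c_int32)

# The size is 4 everywhere, but the name is platform dependent because
# c_int32 is an alias for a native type rather than a class of its own.
print(name, size)
print(repr(ctypes.c_int32))
```

A doctest that prints the repr of such a type therefore encodes one platform's answer; checking the size, or comparing the returned type with ctypes.c_int32 by identity, avoids that dependence.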

ngoldbaum · 2024-07-19T20:21:03Z

Given the AI hallucinated some of the reprs and other doctesting output, I'd also appreciate it if you could make sure that the real output matches what the AI expects.

There may very well be cases where our reprs could be improved, but we shouldn't add reprs that don't yet exist to the docs before we change them to something nicer.

In this case the repr of the ctypes.Structure subclass that numpy dynamically defines is not ctypes.Structure but is instead just struct because that's the name we give it when we dynamically create the type. This would have been caught if a human had verified that the doctest output the AI generates matches the real output from the library.
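
To illustrate the point about dynamically created types, here is a minimal ctypes-only sketch. The field layout is made up, and passing __module__=None is an assumption about how such a type can be built, not a copy of numpy's code. The name handed to type() is what the repr reports, regardless of the base class.

```python
import ctypes

# Build a Structure subclass at runtime. The first argument to type()
# becomes the class name, and that name (not the base class) is what
# repr() shows.
DynamicStruct = type(
    "struct",
    (ctypes.Structure,),
    {
        "_fields_": [("x", ctypes.c_int32), ("y", ctypes.c_double)],
        "__module__": None,  # assumption: drop the module prefix from the repr
    },
)

print(DynamicStruct.__name__)
print(repr(DynamicStruct))
print(issubclass(DynamicStruct, ctypes.Structure))
```

Running a snippet like this, rather than predicting its output, is the kind of check that catches a guessed repr.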

bmwoodruff · 2024-07-19T20:32:03Z

The tests we ran were all before the recent update to the tester. I'll take a look at the code we're using, and update things appropriately. The output of most of the AI generated scripts is hallucinated, so I wrote a script to strip all output, and then insert appropriate output afterwards. The tests passed before the recent change. I'll look into it.
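
For readers curious what a "strip the output, then insert the real output" step might look like, here is a minimal sketch built on the standard library doctest parser. It is not the script used for this PR; the function name regenerate_outputs is hypothetical, and it assumes each example is a single statement and that outputs contain no blank lines (which doctest would need written as <BLANKLINE>).

```python
import contextlib
import doctest
import io


def regenerate_outputs(docstring_text):
    """Return the text with each example's expected output replaced by
    the output actually produced when the example is run."""
    parser = doctest.DocTestParser()
    namespace = {}  # shared across examples, as in a doctest session
    pieces = []
    for part in parser.parse(docstring_text):
        if isinstance(part, str):
            pieces.append(part)  # prose between examples is kept as-is
            continue
        # 'single' mode echoes the value of a bare expression, just as
        # the interactive prompt does.
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(compile(part.source, "<example>", "single"), namespace)
        # Re-emit the source with prompts, followed by the real output;
        # the previously expected output (part.want) is discarded.
        indent = " " * part.indent
        source_lines = part.source.rstrip("\n").split("\n")
        pieces.append(indent + ">>> " + source_lines[0] + "\n")
        for line in source_lines[1:]:
            pieces.append(indent + "... " + line + "\n")
        for line in buffer.getvalue().splitlines():
            pieces.append(indent + line + "\n")
    return "".join(pieces)
```

Because the regenerated output comes from the machine that runs the script, platform-dependent reprs would still need a human check or a run in the project's CI environment.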

github-actions bot added the 04 - Documentation label Jul 2, 2024

eagunn mentioned this pull request Jul 4, 2024

DOC: AI generated examples for ma.reshape #26830

Merged

This was referenced Jul 7, 2024

DOC: AI generated examples for ma.left_shift. #26828

Merged

DOC: AI-Gen examples for ma.put #26829

Merged

DOC: AI generated examples for ma.correlate. #26831

Merged

charris closed this Jul 18, 2024

charris reopened this Jul 18, 2024

DOC: AI-Gen examples ctypeslib.as_ctypes_types

0d8832e

I used AI Llama 3 to help create these. @bmwoodruff and I reviewed them. [skip actions] [skip azp] [skip cirrus]

otieno-juma force-pushed the ai-examples-ctypeslib-as-ctypes branch from 6694c6a to 0d8832e Compare July 18, 2024 17:51

charris merged commit c8ed7e9 into numpy:main Jul 18, 2024
4 checks passed

bmwoodruff mentioned this pull request Jul 20, 2024

DOC: fix ctypes example #26989

Merged
