Human in the loop #1306

remi-braun · 2026-04-29T08:07:27Z

Hello,

I must say I am a bit surprised by the way this library is now evolving.
xarray-spatial was stalled for a couple of years and now it seems development has become crazy, with a multitude of new features appearing almost every day.

You kindly added the Claude repository, so you 100% own up to the extensive use of generative AI; thanks for being so transparent. However, with this coding pace I must ask: is there still a human in the loop in xarray-spatial, or has this lib become a testing repo for Claude's ability in geospatial?
Moreover, other questions are emerging:

What is the point of recoding everything here, moreover on your own? What is exactly the scope of this library?
Don't you create an illusion of completeness?
What about fragmentation?
What about the bus number of this library? Aren't you scared it is now less than 1 as you delegate most of your coding to Claude?
Do you have any safeguard to avoid these issues?

Best Regards,

Rémi

brendancol (Maintainer) · 2026-05-01T04:17:05Z

@remi-braun thanks for the comments, questions, and frankly for noticing the changes to xarray-spatial. Happy to answer them one by one. Also, thanks for your thoughtful comments on the hillshade discrepancies from a while back on issue #748.

I don't mean to distract... but let's start with another question first...why did it take so long for me / others to fix issue #748 you were involved in? The issue was filed in 2022 by @thuydotm, 2023 I was triaging issues but something must have distracted...2024 you posted an excellent comment...2026 fixes were made, partly using help from Claude and we merged a fix. This is NOT an ideal timeline for problem to solution to release!!

The reason for such a timeline, as you pointed out, is that xarray-spatial stalled for a while. It stalled because I personally was unable to support the project due to a few things:

In 2021, xarray-spatial received some funding from a larger company and an aggressive scope that I was optimistic my company's team could complete. We got a lot of good stuff done, but delivered subpar distributed reprojection using dask.arrays. Small, in-memory stuff was easy...but basically irrelevant...and we had accuracy trouble on the larger-than-memory outputs. We lost that funding and I was eventually forced to downsize the team. I had designed the company to be profitable without investors but I found myself with jagged revenue and trouble making payroll. Those failures all resulted from my decisions, including the staff hiring, hours estimates, etc.
In late 2022, 6 weeks before issue #748 (Discrepancy between xarray-spatial hillshade and GDAL hillshade) was filed, my first child was born. My pattern of open source contribution at night and on weekends was no longer "temporally viable".
I was generally depressed about the state of xarray-spatial, and the failed engagement from 2021. Opening the project would give me anxiety. I was still allocating what I could to support the work, but issues simply weren't getting completed or released. Any time I had was allocated to making sure payroll could be met for remaining employees by doing paid work for clients...mostly closed source.

This gives you a bit of context about my perspective on why the project stalled. Let me offer a personal apology that your contribution to the project wasn't better integrated and integrated sooner.

You kindly added the Claude repository, so you 100% own up to the extensive use of generative AI; thanks for being so transparent.

You're welcome, but I added the Claude files so I myself could share them on the different machines for testing, not transparency. It's also nice that other contributors can use them and I was also excited about hearing the military / US government banned Claude...

is there still a human in the loop in xarray-spatial, or has this lib become a testing repo for Claude's ability in geospatial?

Yes, there is a human in the loop, and it's me. Nope, xarray-spatial is not a "testing repo"...although any software endeavor is going to "test" the ability of its contributors, human or agent, to accomplish goals in its given domain.

What is the point of recoding everything here, moreover on your own?

Is it fair if I rephrased the question as: "Why reimplement things that already have implementations folks use from Python, and where do you find the motivation to do so?"

I wanted a geospatial stack in Python that doesn't involve GDAL because I have personally found GDAL difficult to install and extend. The world owes GDAL a huge "thank you", and I'm not saying that "write once, and wrap for all other languages" is a bad idea...but I wanted a library for geospatial Python that didn't delegate heavy lifting to C/C++ extensions. I wanted a toolbox of geospatial "ufuncs" that didn't create new data structures (e.g. class MyNewRasterContainer(): pass) and I wanted it written using technologies I was excited about around 2018 or 2019 (xarray, numba, dask and later cupy). A few years before, I'd been working on datashader and always wanted more GIS-focused aggregation functions, and it was clear that a separate library would be better than bulking out datashader with too much geo.

Let's be clear, there are redundancies that can't simply be explained by, "I don't like GDAL". xrspatial.hydro overlaps with pysheds...a great library with numba integration, but lacks focus on xarray, cupy or dask. Mapclassify is awesome, but focuses more on vector data than raster...and then there are many libraries that do similar things, but I don't like their spellings / interfaces. Zonal tools would be a good example of that.

There may be many other popular tools I'm unfamiliar with that make features of xarray-spatial redundant, but it's not my goal to duplicate for duplication's sake.

To address your comment, "...moreover on your own"...yes it's hard. I don't have institutional backing and I contribute a lot of unpaid personal time, but I do love it again now and want to increase momentum.

What is exactly the scope of this library?

There is no "exact" scope, but I do have some guiding rules:

No GDAL or GDAL wrappers allowed (e.g. no rasterio, rioxarray, geocube, etc.). Again, great projects...no beef, but this library passes on "goodle".
Numba for heavy lifting where needed.
Dask for horizontal scaling.
Generally, no new data structures for folks to learn...mostly just xarray.DataArrays and xarray.Datasets as inputs and outputs...
The focus is geospatial raster processing, not geospatial vector processing, so we support rasterize and polygonize, but not trying to replace cuSpatial, geopandas, etc.

Don't you create an illusion of completeness?

I don't fully understand this question. Do you feel the library asserts a claim of "geospatial completeness" or just "geospatial raster completeness"? My hope is neither, because we haven't even made it to a 1.0.0 release yet. There is no stable API yet for xarray-spatial. I hope to complete a 1.0.0 release when all existing tools are fully validated for accuracy, performance, security and backend array parity (numpy, dask, cupy, dask+cupy).

What about fragmentation?

Do you mean within the geospatial ecosystem, python, or xarray more specifically?

What about the bus number of this library? Aren't you scared it is now less than 1 as you delegate most of your coding to Claude?

I'm not familiar with "bus number". Happy to learn though if you can provide more details. Claude doesn't scare me, but the issue #748 timeline does scare me. Core developers of popular libraries deciding to have more well rounded lives scare me too.

Do you have any safeguard to avoid these issues?

Not sure about "these issues", but the last few releases of xarray-spatial have been focused on security updates and accuracy updates. Each day I usually am able to run a few modules through security, performance, or accuracy sweeps. I've been finding many CRITICAL, and HIGH category bugs and fixing as fast as possible. Most of the recent frenetic pace has been fixing performance and security bugs...but there are many new features in 2026, and my hope is that "going fast fixes going fast".

remi-braun (Author) · 2026-05-06T09:31:48Z

Hello,

First of all, thanks for the reply 😄

I totally understand your struggles and lack of time to maintain the library. Open source is far from easy and relies too many times on the maintainer's personal dedication. This discussion really doesn't aim to destabilise you or whatever. However, as AI usage grows, I think these practical and ethical discussions need to happen.

Disclaimer: this will be a bit long; English is not my native language and this answer has not passed through an AI filter. 100% human here 😉

Context

I had time to think, and maybe I have to explain my biggest discomforts with the new development of this library to add some context. I won't say that AI is not helping us all a lot finding bugs, etc. The issues I want to talk about here are elsewhere.

First of all, and maybe the biggest, we don't know any more who @brendancol is on GitHub. Sometimes, like here, it's you in person (I assume, but I'm not even sure), sometimes it's 100% Claude (in PRs etc.), and sometimes there is a grey area, like the answer to my issue. There is trust in human interaction. When this trust is broken, it's really hard to build again. I would encourage you to keep your account but to contribute any AI vibe coded thing under another account, just to dissipate this grey zone. IMO, it is extremely uncomfortable. You state there is still a human in the loop, but how can I be sure when your account mixes everything?
There is an implicit trust behind a library (the stars, the xarray official backing, etc.) and often an implicit governance. Here it seems everything has changed without any warning, not in the ReadMe or in the discussion section, taking that legacy of trust for granted. I think you should explain all this so people can be aware of the policy change.
This AI-generated code builds a wall between this library and any potential contributor: who would want to contribute to a library maintained by an AI? And IMO, this goes against the open-source spirit of community.
There is maybe a distortion of competition here, with this much use of AI. You are building extremely fast a library that aims to do everything. But you are building it alone, with your own (or AI's?) architecture choices, not questioned by anybody. This raises a lot of concerns for me. If people are choosing your lib because the many features you expose made them believe a whole team was behind it, that the lib will live forever, they can quickly be deceived, especially after you highlighted your lack of time in your previous message.

Discussion

And now, back to the discussion we had.

I totally understand your need for a GDAL-free geospatial library. However, this task may look like a Promethean one, right? I think the right way of achieving this should be to call on the community and ally with others, not maintain the entire ecosystem on your own. Even less relying so much on an AI. This is fragmentation, as there are for sure other people elsewhere who have the same goal as yourself, maybe building things that are already recognized, but you are like bypassing them using AI.

This is where the bus factor also comes in. This concept, popular in open source, measures the "minimum number of team members that have to suddenly disappear from a project before the project stalls due to lack of knowledgeable or competent personnel". As you highlighted, it was more than one in the beginning when you had funds, and decreased to one when you lost them. But now, it is even less than one, as you cannot state you still 100% own this library: there have been too many changes for a human to carefully follow. I suppose that you don't suddenly have more time to dedicate to this library, right? So how can using any generative AI tool this much ensure this library is properly maintained? AI should be a copilot, not an autopilot.

You could say, OK but my library is very well tested, which seems true, so everything AI does is working. I wouldn't agree. In your tests, I don't see any real-world examples (at least in flood and fire files), so how could I be sure your functions are working? I don't want to invest time and money to find out. These questions come from the fact that the trust implicitly existing between well-meaning humans has been broken. When a human created a library, it was legitimate to think the code was used elsewhere and tested in the real world, as people rarely do things for the beauty of it. Now that it's AI-generated, I am not sure anymore: it could just be code for the sake of ecosystem completion (like you seem to push for in the other discussion thread), without any real-world testing. This is an illusion of completeness: in your docs you do a lot of things, but I am not sure it lives up to the reality.

I am really skeptical about your statement: "going fast fixes going fast". I am more in favor of "take your time to go fast" 😄

Conclusion

Thanks for reading all this, now you know my position. Your library is very interesting in that respect and has started a lot of discussions with colleagues and other people I know. The questions it raises are very interesting ethically and philosophically 😉

brendancol (Maintainer) · 2026-05-06T18:39:42Z

@remi-braun Thanks for the follow-up, and I totally understand about the language barrier. Thanks for doing this in English. You present valid concerns. I think it will help both of us to avoid hyperbolic-ish language (e.g. "crazy", "extremely", "aims to do everything", "vibe coded"). Let's calibrate our words and make progress here.

As an overall disclaimer: I'm a real human with actual constraints. Agentic workflows do change those constraints. I'm an open source project maintainer by self-selection. My opinions are fundamentally biased by my life experience.

First of all, and maybe the biggest, we don't know any more who @brendancol is on GitHub. Sometimes, like here, it's you in person (I assume, but I'm not even sure), sometimes it's 100% Claude (in PRs etc.), and sometimes there is a grey area, like the answer to my issue. There is trust in human interaction. When this trust is broken, it's really hard to build again. I would encourage you to keep your account but to contribute any AI vibe coded thing under another account, just to dissipate this grey zone. IMO, it is extremely uncomfortable. You state there is still a human in the loop, but how can I be sure when your account mixes everything?

When you use an LSP, is that you? What if the LSP suggested an out-of-date function parameter? Could you please set up a separate GitHub account for any code that used LSP autocomplete (joke)? When you use a spell-checker, is that "you" anymore...even though you are falsely portraying an inflated level of spelling consistancy. Did my typo in "consistancy" enhance this conversation because you now know I have a tendency to swap "a" for "e" when typing that word?

I think we can agree that it is a matter of degree. Xarray-Spatial is moving too fast for your taste and you don't have the time or money to verify its claims. That is a fair position to take. I think one should verify the claims of libraries before deploying to prod and you personally may not have bandwidth for that. I think we should be careful not to over-rely on human interaction in the context of mostly deterministic and verifiable toolsets. Have tests, keep testing, assume it doesn't work and keep testing.

There is an implicit trust behind a library (the stars, the xarray official backing, etc.) and often an implicit governance. Here it seems everything has changed without any warning, not in the ReadMe or in the discussion section, taking that legacy of trust for granted.

This project lives within the xarray-contrib org, but I don't receive any "backing" from Xarray or any official endorsement. I'm not sure about "implicit governance". As for "everything has changed", there have been few public API changes to existing tools besides adding new tools. Is there a module in particular, which changed during the last few months, you can cite as an example to strengthen your point here?

I think you should explain all this so people can be aware of the policy change.

What policy change are you referring to? Sounds like you are implying that this project had an AI contribution policy that has now changed. It has never had such a policy, nor will I be spending time to invent one, similar to the way I didn't write the software LICENSE. Are there currently AI policies (that aren't your opinions) you can point to that you feel projects like xarray-spatial should adopt? I haven't been marketing new xarray-spatial features (blogs, YouTube videos, meetups, conferences) yet because I'm focused on first getting to a 1.0.0 release with the scope mentioned above.

This AI-generated code builds a wall between this library and any potential contributor: who would want to contribute to a library maintained by an AI? And IMO, this goes against the open-source spirit of community.

I think open source contribution has changed and is still changing. Popular OS projects were flooded with AI PRs, some honest enhancements, some dishonest attempts to gain community clout. I found this PR within the Dask repo particularly frustrating because what I personally want is for the Dask project to evolve quickly, potentially at the price of stability (or perceived stability). If I could have anything for Dask, it would be @mrocklin with unlimited tokens, a case of Le Croix, and no interruptions.

Xarray-Spatial on the other hand, had ZERO slop PRs during the slopacolypse because the library simply was not a relevant player people wanted to enhance or gain clout from contributing to. I find "open-source spirit of community" to be too vague to be helpful here, similar to the "community-driven" issue label in EOReader. What are the community dynamics within EOReader that leave so many community-driven issues unfixed for years? EOReader hasn't closed a single community-driven issue in 2026?! Prove me wrong by fixing the issues listed in the link.

There is maybe a distortion of competition here, with this much use of AI. You are building extremely fast a library that aims to do everything. But you are building it alone, with your own (or AI's?) architecture choices, not questioned by anybody. This raises a lot of concerns for me. If people are choosing your lib because the many features you expose made them believe a whole team was behind it, that the lib will live forever, they can quickly be deceived, especially after you highlighted your lack of time in your previous message.

Each day, I choose whether or not I try to push xarray-spatial forward, but I can't make the decision for others. The past two months, I have essentially been a lone developer, not explicitly by choice...I just don't have the active contributors the project once had. Do you feel that past contributors should not get any credit because it gives a false impression of a large team backing the project? EOReader has 14 contributors, some non-human, but from the outside it seems like it's just you...wrapping rasterio and xarray. You have 1600+ commits, and the next contributor is non-human, and the next has 14 commits. Why is EOReader different from your characterization of xarray-spatial regarding team size or longevity?

I totally understand your need for a GDAL-free geospatial library. However, this task may look like a Promethean one, right?

Thanks for acknowledging a need for a GDAL-free geospatial library. To plant a flag in the sand, xarray-spatial is not trying to duplicate GDAL or replace it. I just want to not require GDAL for some of my own common use cases and workflows while working on "real world" projects...I'm not there yet.

I think the right way of achieving this should be to call on the community and ally with others, not maintain the entire ecosystem on your own.

I'm not maintaining an entire ecosystem, just a raster toolbox, and hopefully others would want to help, but my primary objective is finishing a 1.0.0 scope as mentioned above. I agree that engaging with communities and having allies is important, but I think you are wrong about there being a "right way". If any "right way" exists, it's hiring experienced engineers and paying them well...like money...not GitHub stars. That is difficult because the market for software engineers has been distorted by the hiring practices of larger companies. To get specific, do the practices inside EOReader exemplify the standards for open source maintenance? Am I wrong that EOReader is primarily your own work and NOT "community driven"? I say that because it doesn't seem to have consistent community contribution, but does have consistent releases.

Even less relying so much on an AI. This is fragmentation, as there are for sure other people elsewhere who have the same goal as yourself, maybe building things that are already recognized, but you are like bypassing them using AI.

Why do you call having a raster toolbox free of GDAL a "Promethean" task, while then saying alternatives already exist? If you didn't wrap GDAL (i.e. rasterio) for EOReader, what would you wrap instead for GeoTIFF IO?

This is where the bus factor also comes in. This concept, popular in open source, measures the "minimum number of team members that have to suddenly disappear from a project before the project stalls due to lack of knowledgeable or competent personnel". As you highlighted, it was more than one in the beginning when you had funds, and decreased to one when you lost them. But now, it is even less than one, as you cannot state you still 100% own this library: there have been too many changes for a human to carefully follow. I suppose that you don't suddenly have more time to dedicate to this library, right? So how can using any generative AI tool this much ensure this library is properly maintained? AI should be a copilot, not an autopilot.

Ahh ok bus factor...how many people need to get run over by buses before a project can't be maintained...correct? I'm hoping to one day have a bus factor of NaN, but right now it is 1...because of my PyPI credentials I believe. That is a valid issue to raise and there has been discussion with xarray-contrib there.

As you highlighted, it was more than one in the beginning when you had funds, and decreased to one when you lost them. But now, it is even less than one, as you cannot state you still 100% own this library: there have been too many changes for a human to carefully follow.

Who pays and "who owns what" is a good question. Ownership here is laid out in the LICENSE file...but more fundamentally, when people contribute to open source, how do they still afford the basics of life? Some folks are academics and open source is a hobby and they survive on the tuition of students or institutional endowments. Some contributors are independently wealthy. Some have private business backing them, and some are funded by taxpayer money. This stuff is difficult. I don't currently have the funds to allocate more than my own time to xarray-spatial, but I would love help from folks who can share a vision.

I suppose that you don't suddenly have more time to dedicate to this library, right? So how can using any generative AI tool this much ensure this library is properly maintained?

AI allows me to use smaller blocks of time in my fragmented schedule. In 2022, I needed large continuous blocks of uninterrupted time to get ANYTHING done. Being able to use a 25-minute block of time is a nonlinear productivity gain for me. Maybe you feel it's not true productivity. You feel differently about the tools and I respect your opinion.

AI should be a copilot, not an autopilot.

Fine. How do you know the difference unless you really have your thumb on the pulse of a project? We all still have upper bounds to what we can contribute. Some days maybe things are moving too fast. Some days nothing happens.

I am really skeptical about your statement: "going fast fixes going fast". I am more in favor of "take your time to go fast" 😄

I'm skeptical of my statement too...

I think it's possible that as an industry, community, or whatever, we've lost the luxury of "slow is smooth, smooth is fast".

Sorry to call out EOReader again, but there are potential critical security bugs there that any malicious actor can discover and weaponize against unsuspecting users of the library...keyword "potential", because I myself don't have the necessary poisoned STAC catalog to exercise the vulnerability.

So I forked EOReader and at least documented it in my fork:

brendancol/eoreader#1
brendancol/eoreader#2

Do you think these issues are valid? They were discovered within 3 minutes of cloning eoreader and running a rudimentary security scan with Claude Code. Everyone's vulnerabilities are now on display for the world. Give a bored hacker or a government hacker an evening of focus and they could craft the necessary poisoned STAC catalog to exploit EOReader users. You mention the impact on trust of using AI tools? What about community trust in maintainers to stay on the edge of available tools and improve their libraries to avoid such critical bugs?

Closing thought...

Any interest in seeing if we can collaborate on a few PRs? I don't have other folks in the project. If there is a concrete area of xarray-spatial you are interested in improving, I'm happy to focus there and/or review your code.

Sorry if my response is incomplete or doesn't address things you called out.

Thank you for putting time into this discussion. I find it motivating to connect with people, especially in other countries and wow what a beautiful place Strasbourg must be. I got to visit Colmar in 2017 after a GeoPython and loved it...

100% of my part of this discussion is free of AI...
