Commit 0c6cb9e

Merge pull request #820 from git/gsoc-2026-more-ideas
SoC-2026: add more ideas based on Christian's suggestions
2 parents a128927 + 79cb13d commit 0c6cb9e

1 file changed

+146
-7
lines changed

SoC-2026-Ideas.md

Lines changed: 146 additions & 7 deletions
@@ -143,6 +143,144 @@ _Possible mentors_:
143143
* Siddharth Asthana < <siddharthasthana31@gmail.com> >
144144
* Lucas Seiki Oshiro < <lucasseikioshiro@gmail.com> >
145145

146+
### Improve disk space recovery for partial clones
147+
148+
Git's partial clone feature allows users to clone repositories without downloading
149+
all objects immediately, which is particularly useful for very large repositories.
150+
Objects are fetched on-demand from "promisor remotes" as needed. However, over time,
151+
clients may accumulate large local blobs that are no longer needed but remain on disk,
152+
and currently there's no easy way to reclaim this space.
153+
154+
This project aims to improve `git backfill` (or create a new command) to allow
155+
clients to remove large local blobs when they are available on a promisor remote.
156+
This would help users who want to get back disk space while maintaining the ability
157+
to re-fetch objects when needed.
158+
159+
The project involves:
160+
- Designing a safe mechanism to identify which blobs can be removed
161+
- Implementing the removal process while maintaining repository integrity
162+
- Ensuring removed objects can be transparently re-fetched when needed
163+
- Adding appropriate safeguards and user controls
164+
165+
**Important note:** While the project mentions `git backfill`, it is not yet
166+
decided that it is the right place for this command. Other potential candidates
167+
for placement are `git gc` / `git repack` / `git maintenance`. A design discussion
168+
with the community is expected as part of this project to finalize the most
169+
appropriate placement for this command.
170+
171+
**Getting started:** Build Git from source, set up a partial clone and experiment
172+
with promisor remotes, study the existing `git-backfill` command (if available)
173+
or related functionality, understand how Git tracks and fetches objects from
174+
promisor remotes, review documentation on partial clones in
175+
`Documentation/technical/partial-clone.txt`, and submit a micro-patch to
176+
demonstrate familiarity with the codebase.
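
As a starting point for the experiments suggested above, the following is a minimal sketch of a local partial clone. The repository names `server` and `client`, the file name, and the committer identity are made up for illustration; the commands themselves are standard Git. It creates a throwaway "server" repository, clones it without blobs, and lists the objects that the clone knows about but has not downloaded.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical "server" repository holding one commit with one blob.
git init -q server
git -C server config user.name "Example"
git -C server config user.email "example@example.com"
echo "hello" > server/file.txt
git -C server add file.txt
git -C server commit -q -m "add file"
# Allow clients to request filtered (partial) clones from this repository.
git -C server config uploadpack.allowFilter true

# Blobless partial clone. --no-checkout avoids the on-demand blob fetch
# that populating the working tree would otherwise trigger.
git clone -q --no-checkout --filter=blob:none "file://$work/server" client

# The clone records "origin" as a promisor remote together with its filter.
git -C client config remote.origin.promisor
git -C client config remote.origin.partialclonefilter

# Objects that are referenced but absent locally are printed with a leading "?".
git -C client rev-list --objects --missing=print HEAD | grep '^?'
```

Running `git -C client checkout` afterwards fetches the missing blob from the promisor remote, which is the state a disk space recovery command would need to be able to undo safely.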
177+
178+
**Resources:**
179+
- [Partial clone documentation](https://git-scm.com/docs/partial-clone)
180+
- [Git Protocol v2 documentation](https://git-scm.com/docs/gitprotocol-v2)
181+
182+
_Expected Project Size_: 175 hours or 350 hours
183+
184+
_Difficulty_: Medium to Hard
185+
186+
_Languages_: C, shell(bash)
187+
188+
_Possible mentors_:
189+
190+
* Christian Couder < <christian.couder@gmail.com> >
191+
* Karthik Nayak < <karthik.188@gmail.com> >
192+
* Justin Tobler < <jltobler@gmail.com> >
193+
* Siddharth Asthana < <siddharthasthana31@gmail.com> >
194+
* Ayush Chandekar < <ayu.chandekar@gmail.com> >
195+
* Lucas Seiki Oshiro < <lucasseikioshiro@gmail.com> >
196+
197+
### Implement promisor remote fetch ordering
198+
199+
When a Git repository is configured with multiple promisor remotes, there's
200+
currently no mechanism to specify or optimize the order in which these remotes
201+
should be queried when fetching missing objects. Different remotes may have
202+
different performance characteristics, costs, or reliability, making fetch
203+
order an important consideration.
204+
205+
This project aims to implement a fetch ordering mechanism for multiple promisor
206+
remotes. The order could be:
207+
- Configured locally by the client
208+
- Advertised by servers through the promisor-remote protocol
209+
210+
The key challenge is designing a flexible system that allows servers to
211+
communicate their preferred fetch order to clients (to ensure optimal
212+
performance and cost management).
213+
214+
**Getting started:** Build Git from source, set up a repository with multiple
215+
promisor remotes and experiment with object fetching, study how Git currently
216+
handles multiple remotes, review the promisor-remote protocol in
217+
`Documentation/gitprotocol-v2.txt`, understand partial clone implementation,
218+
and submit a micro-patch to demonstrate familiarity with the codebase.
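
To experiment with the setup described above, a repository can be given several promisor remotes through ordinary configuration. The sketch below is only an illustration: the remote names `fast` and `backup` and their URLs are hypothetical, and nothing is fetched. It shows which settings mark a remote as a promisor remote today; the ordering mechanism this project proposes does not exist yet and is not represented here.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
git init -q "$work/repo"

# Two hypothetical remotes; the URLs are placeholders and are never contacted.
git -C "$work/repo" remote add fast https://fast.example.com/repo.git
git -C "$work/repo" remote add backup https://backup.example.com/repo.git

# Mark both remotes as promisor remotes.
git -C "$work/repo" config remote.fast.promisor true
git -C "$work/repo" config remote.backup.promisor true
# Optionally record the filter to use when fetching from a remote.
git -C "$work/repo" config remote.fast.partialclonefilter blob:none

# List every remote currently marked as a promisor remote.
git -C "$work/repo" config --get-regexp '^remote\..*\.promisor$'
```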
219+
220+
**Resources:**
221+
- [Partial clone documentation](https://git-scm.com/docs/partial-clone)
222+
- [Git Protocol v2 documentation](https://git-scm.com/docs/gitprotocol-v2)
223+
224+
_Expected Project Size_: 175 hours or 350 hours
225+
226+
_Difficulty_: Medium to Hard
227+
228+
_Languages_: C, shell(bash)
229+
230+
_Possible mentors_:
231+
232+
* Christian Couder < <christian.couder@gmail.com> >
233+
* Karthik Nayak < <karthik.188@gmail.com> >
234+
* Justin Tobler < <jltobler@gmail.com> >
235+
* Siddharth Asthana < <siddharthasthana31@gmail.com> >
236+
* Ayush Chandekar < <ayu.chandekar@gmail.com> >
237+
* Lucas Seiki Oshiro < <lucasseikioshiro@gmail.com> >
238+
239+
### Enhance promisor-remote protocol for better-connected remotes
240+
241+
Currently, the promisor-remote protocol allows servers to advertise remotes
242+
that the server itself uses as promisor remotes. However, as suggested by
243+
Junio Hamano, it would be more useful if servers could advertise
244+
"better-connected" remotes - remotes that might not be promisor remotes
245+
for the server but would be good choices for the client.
246+
247+
This enhancement would allow servers to guide clients toward optimal remote
248+
configurations, potentially improving performance and reducing load on
249+
individual servers by distributing requests across a network of remotes.
250+
251+
This project involves:
252+
- Extending the promisor-remote protocol to support advertising
253+
better-connected remotes
254+
- Implementing server-side logic to determine and advertise appropriate remotes
255+
- Implementing client-side handling of these advertisements
256+
- Designing the protocol extension with backward compatibility in mind
257+
- Testing with various network topologies
258+
259+
**Getting started:** Build Git from source, study the current promisor-remote
260+
protocol implementation, read Junio's suggestion in `Documentation/gitprotocol-v2.txt`,
261+
understand how Git currently advertises and uses promisor remotes, set up test
262+
scenarios with multiple interconnected remotes, and submit a micro-patch to
263+
demonstrate familiarity with the codebase.
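
When studying how promisor remotes are advertised and accepted today, the relevant knobs are, as far as the Git documentation describes them, `promisor.advertise` on the server and `promisor.acceptFromServer` on the client. The sketch below only sets these options on two throwaway repositories (the directory names are hypothetical) and assumes a Git version that includes the `promisor-remote` capability; older versions will store the values but ignore them.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
git init -q "$work/server"
git init -q "$work/client"

# Server side: advertise the promisor remotes this server itself uses.
git -C "$work/server" config promisor.advertise true

# Client side: accept an advertised remote only if a remote with the same
# name and URL is already configured locally.
git -C "$work/client" config promisor.acceptFromServer knownUrl

# Show the resulting settings.
git -C "$work/server" config promisor.advertise
git -C "$work/client" config promisor.acceptFromServer
```

Advertising "better-connected" remotes, as proposed here, would go beyond what these two options express, since they only cover remotes the server itself uses as promisor remotes.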
264+
265+
**Resources:**
266+
- [Partial clone documentation](https://git-scm.com/docs/partial-clone)
267+
- [Git Protocol v2 documentation - promisor remote section](https://git-scm.com/docs/gitprotocol-v2#_promisor_remotepr_info)
268+
269+
_Expected Project Size_: 175 hours or 350 hours
270+
271+
_Difficulty_: Hard
272+
273+
_Languages_: C, shell(bash)
274+
275+
_Possible mentors_:
276+
277+
* Christian Couder < <christian.couder@gmail.com> >
278+
* Karthik Nayak < <karthik.188@gmail.com> >
279+
* Justin Tobler < <jltobler@gmail.com> >
280+
* Siddharth Asthana < <siddharthasthana31@gmail.com> >
281+
* Ayush Chandekar < <ayu.chandekar@gmail.com> >
282+
* Lucas Seiki Oshiro < <lucasseikioshiro@gmail.com> >
283+
146284
### Complete and extend the `remote-object-info` command for `git cat-file`
147285

148286
From around June 2024 to March 2025, work was undertaken by Eric Ju to add a
@@ -188,10 +326,11 @@ _Languages_: C, shell(bash)
188326

189327
_Possible mentors_:
190328

191-
* Christian Couder < <christian.couder@gmail.com> >
192-
* Karthik Nayak < <karthik.188@gmail.com> >
193-
* Justin Tobler < <jltobler@gmail.com> >
194-
* Ayush Chandekar < <ayu.chandekar@gmail.com> >
195-
* Siddharth Asthana < <siddharthasthana31@gmail.com> >
196-
* Lucas Seiki Oshiro < <lucasseikioshiro@gmail.com> >
197-
* Chandra Pratap < <chandrapratap3519@gmail.com> >
329+
* Christian Couder < christian.couder@gmail.com >
330+
* Karthik Nayak < karthik.188@gmail.com >
331+
* Justin Tobler < jltobler@gmail.com >
332+
* Ayush Chandekar < ayu.chandekar@gmail.com >
333+
* Siddharth Asthana < siddharthasthana31@gmail.com >
334+
* Lucas Seiki Oshiro < lucasseikioshiro@gmail.com >
335+
* Chandra Pratap < chandrapratap3519@gmail.com >
336+
