Identifying unmaintained open source packages at scale
Open source software often comes out of a developer solving their own problem and giving the solution away to the community. Sometimes this surfaces as a one-time code dump, but more commonly the original developer sticks around to maintain their work, adding features and fixing bugs. Eventually the original developer may no longer be interested in continuing to maintain a package, at which point it is either taken over by other contributors or abandoned.
Detecting abandoned packages is important because an abandoned package may not receive security fixes, or compatibility patches for new versions of the underlying language or other packages. Some packages are explicitly abandoned by their authors, but many more enter a silent state where it’s unclear whether the maintainer is still around.
Infield detects abandoned dependencies using a combination of factors, including release cadence, commit history, maintainer comments, and community behavior. We’ve trained a model that combines these inputs into an abandonment score for each dependency, so we can flag the ones that might be at risk. Our customers can then take action on these dependencies within their repos.
Here’s what we consider in determining a package’s abandonment potential.
Release cadence
Historical release cadence can predict the next “expected release” for a package. For instance, if a package historically averaged one month between releases, but it’s been a year since the last release, that package might be abandoned. Conversely, if a package is released annually and it’s been a year since the last release, it is less likely to be abandoned. We combine release cadence and absolute release staleness with the following formula:
where
X = average days between releases
Y = days since last release
T = absolute time scale (constant)
p, q = weight parameters (constant)
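As a rough illustration of these quantities, the sketch below computes X and Y from a list of release dates and combines them in one plausible way. The weighted sum, the function names, and the default values of T, p, and q are assumptions made for illustration, not the formula Infield uses.

```python
from datetime import date


def release_stats(release_dates, today):
    """Return (X, Y): average days between releases, days since the last release."""
    ordered = sorted(release_dates)
    if len(ordered) < 2:
        raise ValueError("need at least two releases to estimate a cadence")
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    x = sum(gaps) / len(gaps)
    y = (today - ordered[-1]).days
    return x, y


def staleness_score(x, y, t=365.0, p=1.0, q=1.0):
    """Hypothetical combination of relative and absolute staleness.

    ASSUMPTION: a plain weighted sum; the real formula may differ.
    """
    relative = y / max(x, 1.0)  # how overdue the package is by its own cadence
    absolute = y / t            # how stale it is on a fixed time scale
    return p * relative + q * absolute


# A monthly package that has been silent for a year scores far higher
# than an annual package that is only just due for its next release.
monthly = [date(2022, 1, 1), date(2022, 1, 31), date(2022, 3, 2)]
annual = [date(2020, 1, 1), date(2021, 1, 1), date(2022, 1, 1)]
monthly_score = staleness_score(*release_stats(monthly, date(2023, 3, 2)))
annual_score = staleness_score(*release_stats(annual, date(2023, 1, 1)))
```

Under this sketch the relative term separates the two examples above, while the absolute term keeps a package with a very slow cadence from looking healthy forever.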
Commit history
Some packages are in “maintenance mode” where the original maintainer might have left, but collaborators with repo access are still contributing. In this case we want to consider the freshness of new commits to the repo in addition to official releases.
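A minimal sketch of this idea, assuming we already know the timestamp of the last release and the last commit. The function names and the two thresholds are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone


def days_since_last_activity(last_release, last_commit, now):
    """Days since the most recent release or commit, whichever is newer."""
    return (now - max(last_release, last_commit)).days


def in_maintenance_mode(last_release, last_commit, now,
                        release_stale_days=365, commit_fresh_days=90):
    """True when releases have gone stale but commits are still landing.

    ASSUMPTION: both thresholds are illustrative defaults, not tuned values.
    """
    release_age = (now - last_release).days
    commit_age = (now - last_commit).days
    return release_age > release_stale_days and commit_age <= commit_fresh_days


now = datetime(2023, 6, 1, tzinfo=timezone.utc)
last_release = datetime(2022, 1, 1, tzinfo=timezone.utc)
last_commit = datetime(2023, 5, 1, tzinfo=timezone.utc)
quiet_but_alive = in_maintenance_mode(last_release, last_commit, now)
```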
Maintainer comments
The clearest indication of an abandoned package is official communication from the maintainers, or a conspicuous lack of it. We detect this in a few ways:
Maintainer marks a repo as archived or deletes it
Maintainer responds to a GitHub issue or open pull request noting that the package is no longer maintained. We use language models to detect these responses.
Maintainer fails to respond to any GitHub issues or pull requests
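The three signals above can be sketched as follows. This is a simplified stand-in: a keyword pattern replaces the language model, the `archived` field mirrors the boolean of that name on a GitHub repository object, and the `maintainer_replies` thread shape is a hypothetical structure invented for this example. Repository deletion is not covered.

```python
import re

# ASSUMPTION: a crude keyword pattern standing in for a language model.
UNMAINTAINED_PATTERN = re.compile(
    r"\b(no longer (actively )?maintained|not maintained anymore|"
    r"unmaintained|abandoned)\b",
    re.IGNORECASE,
)


def maintainer_signals(repo, threads):
    """Summarize maintainer-driven abandonment signals.

    repo:    dict with an "archived" boolean.
    threads: list of issues / pull requests, each a dict with a
             "maintainer_replies" list of comment strings (hypothetical shape).
    """
    replies = [text for thread in threads for text in thread["maintainer_replies"]]
    return {
        "archived": bool(repo.get("archived", False)),
        "stated_unmaintained": any(UNMAINTAINED_PATTERN.search(r) for r in replies),
        # Unresponsive only if there are threads and none has a maintainer reply.
        "unresponsive": bool(threads)
        and all(not thread["maintainer_replies"] for thread in threads),
    }


signals = maintainer_signals(
    {"archived": False},
    [
        {"maintainer_replies": ["This project is no longer maintained, sorry."]},
        {"maintainer_replies": []},
    ],
)
```

A keyword match like this will miss indirect phrasing and flag quoted text, which is part of why a language model is a better fit for the real task.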
Community behavior
We can detect abandoned packages in part by looking at how the community responds. For example, if we find a fork of a package with fresher commit history than the upstream source, that’s an indication that the community is taking over maintenance of this package.
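A minimal sketch of the fork comparison, assuming each repository is represented as a dict with `full_name` and `pushed_at` fields in the style of GitHub repository metadata. Using `pushed_at` as a proxy for commit freshness is an assumption of this example, and fetching the data is left out.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as '2024-01-15T10:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def fresher_forks(upstream, forks):
    """Return forks pushed to more recently than upstream, newest first."""
    upstream_pushed = parse_timestamp(upstream["pushed_at"])
    newer = [f for f in forks if parse_timestamp(f["pushed_at"]) > upstream_pushed]
    return sorted(newer, key=lambda f: parse_timestamp(f["pushed_at"]), reverse=True)


# Illustrative, made-up repositories.
upstream = {"full_name": "original/pkg", "pushed_at": "2021-03-01T00:00:00Z"}
forks = [
    {"full_name": "alice/pkg", "pushed_at": "2020-12-01T00:00:00Z"},
    {"full_name": "bob/pkg", "pushed_at": "2023-08-15T12:30:00Z"},
    {"full_name": "carol/pkg", "pushed_at": "2022-01-10T09:00:00Z"},
]
candidates = [f["full_name"] for f in fresher_forks(upstream, forks)]
```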
We combine these factors using a machine learning model. We labeled hundreds of packages ourselves, trained a model on those labels, and now use it to predict an “abandonment likelihood” as well as a true/false “abandoned” label.
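To show how a likelihood and a true/false label can come from the same set of factors, here is a minimal sketch using a logistic function. The model type, feature names, weights, bias, and threshold are all hypothetical; in a trained model the weights would be learned from the labeled packages rather than written by hand.

```python
import math

# ASSUMPTION: hand-picked illustrative weights, not learned values.
WEIGHTS = {
    "staleness_score": 0.3,
    "years_since_last_commit": 1.0,
    "archived": 6.0,
    "stated_unmaintained": 6.0,
    "has_fresher_fork": 1.5,
}
BIAS = -3.0


def abandonment_likelihood(features):
    """Map a feature dict to a value in (0, 1) with a logistic function."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def is_abandoned(features, threshold=0.5):
    """Turn the likelihood into a true/false label at a chosen threshold."""
    return abandonment_likelihood(features) >= threshold


archived_pkg = {"archived": 1.0}
active_pkg = {"staleness_score": 1.0, "years_since_last_commit": 0.1}
labels = (is_abandoned(archived_pkg), is_abandoned(active_pkg))
```

Keeping the continuous likelihood alongside the boolean label lets borderline packages be ranked for review instead of being forced to one side of the threshold.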
If you’re interested in reading more about this problem, we suggest this academic paper coming out of China.
If you want to use Infield to track and manage your own open source dependencies, you can get started for free on our Get Started page. We currently support Ruby, Python, and JavaScript packages, with more languages coming soon.


