Weekly Dev Tips

Do I Need a Repository?

Episode Summary

This week we'll answer this extremely common question about the Repository pattern, and when you should think about using it.

Episode Notes

Do I Need a Repository?

This week we'll answer this extremely common question about the Repository pattern, and when you should think about using it.

Sponsor - DevIQ

Thanks to DevIQ for sponsoring this episode! Check out their list of available courses and how-to videos.

Show Notes / Transcript

This week we're going to return to the Repository design pattern to answer a very common question: when should you use it? This question appears very frequently in discussions about Entity Framework or EF Core, usually with someone saying "Since EF already acts like a repository, why would you create your own repository pattern on top of it?"

Before we get into the answer to this question, though, let me point out that if you're interested in the repository pattern in general I have a link to a very useful EF Core implementation in the show notes for this episode that should help get you started or perhaps give you some ideas you can use with your existing implementation. Also, just a reminder that we talked about the pattern in episode 18 on query logic encapsulation, but otherwise I haven't spent a lot of time on repository tips here, yet.

Ok, so on to this week's topic. Should you bother using the repository pattern when you're working with EF or EF Core, since these already act like a repository? If you Google for this, you're likely to discover an article discussing this topic that suggests repository isn't useful. In setting the scene, the author discusses an app he inherited that had performance issues caused by lazy loading, which he says "was needed because the application used the repository/unit of work pattern."

Before going further, let's point out two things. One, lazy loading in web applications is evil. Just don't do it except maybe for internal apps that have very few users and very small data sets. Read my article on why, linked from the show notes. Second, no, you don't need lazy loading if you're using repository. You just need to know how to pass query and loading information into the repository.

The author later goes on to say "one of the ideas behind repository is that you might replace EF Core with another database access library but my view it's a misconception because a) it's very hard to replace a database access library, and b) are you really going to?" I agree that it's very hard to replace your data access library, unless you put it behind a good abstraction. As to whether you're going to, that's a tougher one to answer. I've personally seen organizations change data access between raw ADO.NET, Enterprise Application Block, Typed Datasets, LINQ-to-SQL, LLBLGen, NHibernate, EF, and EF Core. I've probably forgotten a couple. Oh yeah, and Dapper and other "micro-ORMs", too. If you're using an abstraction layer, you can swap out these implementation details quickly and easily. You just write a new class that is essentially an adapter of your repository to that particular tool. If you're hardcoded to any one of them, it's going to be a much bigger job (and so, yeah, you're less likely to do it because of the pain involved.)

Next, the author lists some of the bad parts of using repository. First, sorting and filtering, because a particular implementation he found from 2013 only returned an IEnumerable and didn't provide a way to allow filtering and sorting to be done in the database. Yes, poor implementations of a pattern can result in poor performance. Don't do that if performance is important. Next, he hits on lazy loading again. Ironically, at the time this article was published, EF Core didn't even support lazy loading, so this couldn't be a problem with it. Unfortunately, now it does, but as I mentioned, you shouldn't use it in web apps anyway. It has nothing to do with repository, despite the author thinking they're linked somehow. His third perf-related issue is with updates, claiming that a repository around EF Core would require saving every property, not just those that have changed. This is also untrue. You can use EF Core's change tracking capability with and through a repository just fine.

His fourth and final "bad part" of repositories when used with EF Core is that they're too generic. You can write one generic repository and then use that or subtype from it. He notes that it should minimize the code you need to write, but in his experience as things grow in complexity you end up writing more and more code in the individual repositories. Having less code to write and maintain really is a good thing. The issue with complexity resulting in more and more code in repositories is a symptom of not using another pattern, the specification. In fact, the specification pattern addresses pretty much all of the issues described in his post that I haven't already debunked. The author knows about this pattern, which he describes as 'query objects', but doesn't see how they can be used together with repositories just as effectively as he uses them instead of repositories.

One last thing I want to point out that many folks (including the author of this article) misunderstand is the idea of being able to unit test code that works with data. This might just come down to the definition of a unit test, so I'll start with that. A unit test is a test that only tests your code at the unit level. That typically means a single method, or at most a class since to access a single method you may need to create an instance of a class and thus also execute its constructor, etc. If you have a test that tests more than one class working together, or that depends on code that isn't yours (like, say, an ORM), it's not a unit test. It's an integration test.

The author goes on to suggest that since EF Core supports an in-memory database, you can use that for unit-testing your application. You can't. You can use it for integration testing, which is great. But it's not unit testing. The distinction is important because clean code should be unit testable. If it isn't, it's a code smell, suggesting that you may have too much coupling. You might be OK with that, but you should at least be aware of the issue so you can decide for yourself whether you're OK with it, rather than having a false sense of complacency because your integration tests work well enough.

Would your team or application benefit from an application assessment, highlighting potential problem areas and identifying a path toward better maintainability? Contact me at ardalis.com and let's see how I can help.

Show Resources and Links