I thought I'd take a crack at answering my own question. What follows is just one way of solving the issues 1-3 in my original question.
Disclaimer: I may not always use the right terms when describing patterns or techniques. Sorry for that.
The Goals:
- Create a complete example of a basic controller for viewing and editing Users.
- All code must be fully testable and mockable.
- The controller should have no idea where the data is stored (meaning it can be changed).
- Example to show a SQL implementation (most common).
- For maximum performance, controllers should only receive the data they need—no extra fields.
- Implementation should leverage some type of data mapper for ease of development.
- Implementation should have the ability to perform complex data lookups.
The Solution
I'm splitting my persistent storage (database) interaction into two categories: R (Read) and CUD (Create, Update, Delete). My experience has been that reads are really what cause an application to slow down. And while an individual data manipulation (CUD) is actually slower than a read, it happens much less frequently, and is therefore much less of a concern.
CUD (Create, Update, Delete) is easy. This will involve working with actual models, which are then passed to my Repositories for persistence. Note, my repositories will still provide a Read method, but simply for object creation, not display. More on that later.
R (Read) is not so easy. No models here, just value objects. Use arrays if you prefer. These objects may represent a single model or a blend of many models, anything really. These are not very interesting on their own, but how they are generated is. I'm using what I'm calling Query Objects.
The Code:
User Model
Let's start simple with our basic user model. Note that there is no ORM extending or database stuff at all. Just pure model glory. Add your getters, setters, validation, whatever.
class User
{
public $id;
public $first_name;
public $last_name;
public $gender;
public $email;
public $password;
}
Repository Interface
Before I create my user repository, I want to create my repository interface. This will define the "contract" that repositories must follow in order to be used by my controller. Remember, my controller will not know where the data is actually stored.
Note that my repositories will only ever contain these three methods. The save() method is responsible for both creating and updating users, simply depending on whether or not the user object has an id set.
interface UserRepositoryInterface
{
public function find($id);
public function save(User $user);
public function remove(User $user);
}
SQL Repository Implementation
Now to create my implementation of the interface. As mentioned, my example was going to be with an SQL database. Note the use of a data mapper to prevent having to write repetitive SQL queries.
class SQLUserRepository implements UserRepositoryInterface
{
protected $db;
public function __construct(Database $db)
{
$this->db = $db;
}
public function find($id)
{
// Find a record with the id = $id
// from the 'users' table
// and return it as a User object
return $this->db->find($id, 'users', 'User');
}
public function save(User $user)
{
// Insert or update the $user
// in the 'users' table
$this->db->save($user, 'users');
}
public function remove(User $user)
{
// Remove the $user
// from the 'users' table
$this->db->remove($user, 'users');
}
}
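To illustrate why the interface matters, here is a minimal sketch of a second implementation that keeps users in a plain array. InMemoryUserRepository is a hypothetical name that is not part of the original answer; it assumes, as described above, that save() inserts when the id is empty and updates otherwise. The User class and the interface are repeated only so the snippet runs on its own.

```php
<?php
// Copy of the User model and interface from above, so this sketch is self-contained.
class User
{
    public $id;
    public $first_name;
    public $last_name;
    public $gender;
    public $email;
    public $password;
}

interface UserRepositoryInterface
{
    public function find($id);
    public function save(User $user);
    public function remove(User $user);
}

// Hypothetical in-memory implementation (an assumption, not the author's code).
class InMemoryUserRepository implements UserRepositoryInterface
{
    protected $users = [];
    protected $nextId = 1;

    public function find($id)
    {
        // Return a copy, so changes only stick after save() is called
        return isset($this->users[$id]) ? clone $this->users[$id] : null;
    }

    public function save(User $user)
    {
        // No id yet means "create": hand out the next id
        if ($user->id === null) {
            $user->id = $this->nextId++;
        }
        // Insert and update both end up here
        $this->users[$user->id] = clone $user;
    }

    public function remove(User $user)
    {
        unset($this->users[$user->id]);
    }
}

$repository = new InMemoryUserRepository();
$user = new User();
$user->first_name = 'Ada';
$repository->save($user);      // insert: assigns an id
$user->first_name = 'Grace';
$repository->save($user);      // update: same id, new data
```

Because a controller would only type-hint UserRepositoryInterface, something like this could stand in for SQLUserRepository in tests without touching a database.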
Query Object Interface
Now with CUD (Create, Update, Delete) taken care of by our repository, we can focus on the R (Read). Query objects are simply an encapsulation of some type of data lookup logic. They are not query builders. By abstracting a query like we did our repository, we can change its implementation and test it more easily. An example of a Query Object might be an AllUsersQuery or AllActiveUsersQuery, or even MostCommonUserFirstNames.
You may be thinking "can't I just create methods in my repositories for those queries?" Yes, but here is why I'm not doing this:
- My repositories are meant for working with model objects. In a real world app, why would I ever need to get the password field if I'm looking to list all my users?
- Repositories are often model specific, yet queries often involve more than one model. So what repository do you put your method in?
- This keeps my repositories very simple, not a bloated class of methods.
- All queries are now organized into their own classes.
- Really, at this point, repositories exist simply to abstract my database layer.
For my example I'll create a query object to lookup "AllUsers". Here is the interface:
interface AllUsersQueryInterface
{
public function fetch($fields);
}
Query Object Implementation
This is where we can use a data mapper again to help speed up development. Notice that I am allowing one tweak to the returned dataset—the fields. This is about as far as I want to go with manipulating the performed query. Remember, my query objects are not query builders. They simply perform a specific query. However, since I know that I'll probably be using this one a lot, in a number of different situations, I'm giving myself the ability to specify the fields. I never want to return fields I don't need!
class AllUsersQuery implements AllUsersQueryInterface
{
protected $db;
public function __construct(Database $db)
{
$this->db = $db;
}
public function fetch($fields)
{
return $this->db->select($fields)->from('users')->orderBy('last_name, first_name')->rows();
}
}
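As a rough sketch of what "change the implementation" could look like for a query object, here is an array-backed version of the same interface. ArrayAllUsersQuery and the sample rows are assumptions made for illustration; only AllUsersQueryInterface and its fetch($fields) method come from the answer.

```php
<?php
interface AllUsersQueryInterface
{
    public function fetch($fields);
}

// Hypothetical array-backed implementation, e.g. for tests (not the author's code).
class ArrayAllUsersQuery implements AllUsersQueryInterface
{
    protected $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function fetch($fields)
    {
        $rows = $this->rows;

        // Same ordering as the SQL version: last name, then first name
        usort($rows, function ($a, $b) {
            return [$a['last_name'], $a['first_name']]
                <=> [$b['last_name'], $b['first_name']];
        });

        // Keep only the requested fields in each row
        $wanted = array_flip($fields);
        return array_map(function ($row) use ($wanted) {
            return array_intersect_key($row, $wanted);
        }, $rows);
    }
}

$query = new ArrayAllUsersQuery([
    ['first_name' => 'Ada', 'last_name' => 'Lovelace', 'password' => 'hash-1'],
    ['first_name' => 'Grace', 'last_name' => 'Hopper', 'password' => 'hash-2'],
]);
$users = $query->fetch(['first_name', 'last_name']);
```

The caller still only chooses the fields; the password column never leaves the query object unless it is asked for.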
Before moving on to the controller, I want to show another example to illustrate how powerful this is. Maybe I have a reporting engine and need to create a report for AllOverdueAccounts. This could be tricky with my data mapper, and I may want to write some actual SQL in this situation. No problem, here is what this query object could look like:
class AllOverdueAccountsQuery implements AllOverdueAccountsQueryInterface
{
protected $db;
public function __construct(Database $db)
{
$this->db = $db;
}
public function fetch()
{
return $this->db->query($this->sql())->rows();
}
public function sql()
{
return "SELECT...";
}
}
This nicely keeps all my logic for this report in one class, and it's easy to test. I can mock it to my heart's content, or even use a different implementation entirely.
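A hand-rolled stub might look something like the sketch below. The interface body, the OverdueReport consumer and the account_id / amount_due row fields are all assumptions for the sake of the example; the answer only names AllOverdueAccountsQueryInterface and its fetch() method.

```php
<?php
// Assumed shape of the interface; the answer does not show it.
interface AllOverdueAccountsQueryInterface
{
    public function fetch();
}

// Hypothetical consumer that depends only on the interface.
class OverdueReport
{
    public function totalDue(AllOverdueAccountsQueryInterface $query)
    {
        $total = 0;
        foreach ($query->fetch() as $row) {
            $total += $row['amount_due'];
        }
        return $total;
    }
}

// Stub: no database and no SQL, just canned rows.
$stub = new class implements AllOverdueAccountsQueryInterface {
    public function fetch()
    {
        return [
            ['account_id' => 1, 'amount_due' => 120],
            ['account_id' => 2, 'amount_due' => 80],
        ];
    }
};

$report = new OverdueReport();
$total = $report->totalDue($stub);
```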
The Controller
Now the fun part—bringing all the pieces together. Note that I am using dependency injection. Typically dependencies are injected into the constructor, but I actually prefer to inject them right into my controller methods (routes). This minimizes the controller's object graph, and I actually find it more legible. Note, if you don't like this approach, just use the traditional constructor method.
class UsersController
{
public function index(AllUsersQueryInterface $query)
{
// Fetch user data
$users = $query->fetch(['first_name', 'last_name', 'email']);
// Return view
return Response::view('all_users.php', ['users' => $users]);
}
public function add()
{
return Response::view('add_user.php');
}
public function insert(UserRepositoryInterface $repository)
{
// Create new user model
$user = new User;
$user->first_name = $_POST['first_name'];
$user->last_name = $_POST['last_name'];
$user->gender = $_POST['gender'];
$user->email = $_POST['email'];
// Save the new user
$repository->save($user);
// Return the id
return Response::json(['id' => $user->id]);
}
public function view(SpecificUserQueryInterface $query, $id)
{
// Load user data
if (!$user = $query->fetch($id, ['first_name', 'last_name', 'gender', 'email'])) {
return Response::notFound();
}
// Return view
return Response::view('view_user.php', ['user' => $user]);
}
public function edit(SpecificUserQueryInterface $query, $id)
{
// Load user data
if (!$user = $query->fetch($id, ['first_name', 'last_name', 'gender', 'email'])) {
return Response::notFound();
}
// Return view
return Response::view('edit_user.php', ['user' => $user]);
}
public function update(UserRepositoryInterface $repository, $id)
{
// Load user model
if (!$user = $repository->find($id)) {
return Response::notFound();
}
// Update the user
$user->first_name = $_POST['first_name'];
$user->last_name = $_POST['last_name'];
$user->gender = $_POST['gender'];
$user->email = $_POST['email'];
// Save the user
$repository->save($user);
// Return success
return true;
}
public function delete(UserRepositoryInterface $repository, $id)
{
// Load user model
if (!$user = $repository->find($id)) {
return Response::notFound();
}
// Delete the user
$repository->remove($user);
// Return success
return true;
}
}
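The controller relies on a SpecificUserQueryInterface that is never shown. Judging only from the calls above, it could look roughly like this; the signature and the ArraySpecificUserQuery class are guesses for illustration, not the author's code.

```php
<?php
// Assumed signature, inferred from $query->fetch($id, [...]) in the controller.
interface SpecificUserQueryInterface
{
    public function fetch($id, $fields);
}

// Hypothetical array-backed implementation, handy for controller tests.
class ArraySpecificUserQuery implements SpecificUserQueryInterface
{
    protected $rows; // rows keyed by user id

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function fetch($id, $fields)
    {
        if (!isset($this->rows[$id])) {
            // Falsy result, so the controller's "if (!$user = ...)" returns notFound()
            return null;
        }
        // Only hand back the requested fields
        return array_intersect_key($this->rows[$id], array_flip($fields));
    }
}

$query = new ArraySpecificUserQuery([
    7 => [
        'first_name' => 'Ada',
        'last_name'  => 'Lovelace',
        'email'      => 'ada@example.com',
        'password'   => 'hash',
    ],
]);
$found = $query->fetch(7, ['first_name', 'email']);
$missing = $query->fetch(8, ['first_name', 'email']);
```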
Final Thoughts:
The important things to note here are that when I'm modifying (creating, updating or deleting) entities, I'm working with real model objects, and performing the persistence through my repositories.
However, when I'm displaying (selecting data and sending it to the views) I'm not working with model objects, but rather plain old value objects. I only select the fields I need, and it's designed so I can maximize my data lookup performance.
My repositories stay very clean, and instead this "mess" is organized into my model queries.
I use a data mapper to help with development, as it's just ridiculous to write repetitive SQL for common tasks. However, you absolutely can write SQL where needed (complicated queries, reporting, etc.). And when you do, it's nicely tucked away into a properly named class.
I'd love to hear your take on my approach!
July 2015 Update:
I've been asked in the comments where I ended up with all this. Well, not that far off actually. Truthfully, I still don't really like repositories. I find them overkill for basic lookups (especially if you're already using an ORM), and messy when working with more complicated queries.
I generally work with an ActiveRecord style ORM, so most often I'll just reference those models directly throughout my application. However, in situations where I have more complex queries, I'll use query objects to make these more reusable. I should also note that I always inject my models into my methods, making them easier to mock in my tests.
Answer from Jonathan on Stack Overflow
Based on my experience, here are some answers to your questions:
Q: How do we deal with bringing back fields we don't need?
A: From my experience this really boils down to dealing with complete entities versus ad-hoc queries.
A complete entity is something like a User object. It has properties and methods, etc. It's a first class citizen in your codebase.
An ad-hoc query returns some data, but we don't know anything beyond that. As the data gets passed around the application, it is done so without context. Is it a User? A User with some Order information attached? We don't really know.
I prefer working with full entities.
You are right that you will often bring back data you won't use, but you can address this in various ways:
- Aggressively cache the entities so you only pay the read price once from the database.
- Spend more time modeling your entities so they have good distinctions between them. (Consider splitting a large entity into two smaller entities, etc.)
- Consider having multiple versions of entities. You can have a User for the back end and maybe a UserSmall for AJAX calls. One might have 10 properties and one has 3 properties.
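The first suggestion in the list above, caching whole entities, could be sketched as a decorator around a repository. Every name here (UserRepository, CountingUserRepository, CachingUserRepository) is hypothetical, and the sketch ignores cache invalidation entirely, which a real application would need to handle.

```php
<?php
class User
{
    public $id;
    public $first_name;
}

// Hypothetical one-method repository interface for this sketch.
interface UserRepository
{
    public function find($id);
}

// Stands in for the database and counts how often it is hit.
class CountingUserRepository implements UserRepository
{
    public $reads = 0;

    public function find($id)
    {
        $this->reads++;
        $user = new User();
        $user->id = $id;
        $user->first_name = 'Ada';
        return $user;
    }
}

// Decorator: pays the read price once per id, then serves from memory.
class CachingUserRepository implements UserRepository
{
    protected $inner;
    protected $cache = [];

    public function __construct(UserRepository $inner)
    {
        $this->inner = $inner;
    }

    public function find($id)
    {
        if (!array_key_exists($id, $this->cache)) {
            $this->cache[$id] = $this->inner->find($id);
        }
        return $this->cache[$id];
    }
}

$backend = new CountingUserRepository();
$repository = new CachingUserRepository($backend);
$first = $repository->find(42);
$second = $repository->find(42); // served from the cache
```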
The downsides of working with ad-hoc queries:
- You end up with essentially the same data across many queries. For example, with a User, you'll end up writing essentially the same select * for many calls. One call will get 8 of 10 fields, one will get 5 of 10, one will get 7 of 10. Why not replace all with one call that gets 10 out of 10? The reason this is bad is that it is murder to re-factor/test/mock.
- It becomes very hard to reason at a high level about your code over time. Instead of statements like "Why is the User so slow?" you end up tracking down one-off queries and so bug fixes tend to be small and localized.
- It's really hard to replace the underlying technology. If you store everything in MySQL now and want to move to MongoDB, it's a lot harder to replace 100 ad-hoc calls than it is a handful of entities.
Q: I will have too many methods in my repository.
A: I haven't really seen any way around this other than consolidating calls. The method calls in your repository really map to features in your application. The more features, the more data specific calls. You can push back on features and try to merge similar calls into one.
The complexity at the end of the day has to exist somewhere. With a repository pattern we've pushed it into the repository interface instead of maybe making a bunch of stored procedures.
Sometimes I have to tell myself, "Well it had to give somewhere! There are no silver bullets."
There is no definitive answer for your question since the patterns you use highly depend on the project's complexity and needs.
However, a Service and a Repository are two different things. The Repository is a common wrapper for the model and is where you write the queries to the database. IMO you shouldn't add logic here; the sole purpose of a repository is to fetch data from or store data in the database. The advantage of Repositories is how easy they make it to switch to other database systems.
A Service, IMO, is where you add all the application's logic.
For additional information refer to this answer.
From DDD (Domain Driven Design) the responsibility of a repository is to take care of loading, storing, modifying and deleting an entity on the designated data storage (which may or may not be a database -- it may even be a remote server or just a file).
A service, on the other hand, has (or should have) a very narrow responsibility of performing some useful activity. Each service is instantiated separately and then injected into code in the application layer or above, which acts as a bridge (Bridge pattern). This approach has proven to be very advantageous because it allows you to manage the dependencies between otherwise unrelated (uncoupled) code.
Those two definitions and the origin of the concepts show that they actually are very different things. By pure chance you noticed that a repository and a service have an apparent overlap, but that's due to implementation details or plain misuse. Their responsibilities may under some circumstances go hand in hand (giving rise to a collaboration) but they really are orthogonal concepts.
Furthermore, Repositories should arise from a deep layer (Persistence or DAL, Data Access Layer). Services, on the other hand, are often vertical and cross-cutting, or arise in the application layer.
Through proper layering the differences between repositories and services become even more apparent.
Do not think about them as pure code artifacts you can move around. They are well-defined concepts that are useful for understanding and designing the structure of a system. They turn into actual code only as a consequence of that design.
I hope I succeeded in writing something that clears up some ideas and is not confusing.
Am I using these patterns correctly?
Yes, you are. The general advice I'd give is not to give your classes in general, and services in particular, too many responsibilities: follow the single responsibility principle, which basically states "my class should have only one reason to change" (I first attributed this to M. Fowler; it was actually R. Martin, thanks to Gordon's correction in the comments).
Your UserService seems to be handling too many different tasks: it handles registration and grabs followers. And probably sends SMS. Extract registration-related logic into UserRegistrationService class.
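As a rough sketch of that extraction, a registration-only service might look like the following. The original UserService is not shown, so the method name register(), the validation step and the one-method repository interface are all assumptions; password_hash() and filter_var() are standard PHP functions.

```php
<?php
class User
{
    public $id;
    public $email;
    public $password;
}

// Assumed minimal repository contract for this sketch.
interface UserRepositoryInterface
{
    public function save(User $user);
}

// Hypothetical service with a single reason to change: how users register.
class UserRegistrationService
{
    protected $users;

    public function __construct(UserRepositoryInterface $users)
    {
        $this->users = $users;
    }

    public function register($email, $plainPassword)
    {
        if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException('Invalid email address');
        }

        $user = new User();
        $user->email = $email;
        // Never store the plain-text password
        $user->password = password_hash($plainPassword, PASSWORD_DEFAULT);

        $this->users->save($user);
        return $user;
    }
}

// Throwaway in-memory repository so the sketch runs without a database.
$repository = new class implements UserRepositoryInterface {
    public $saved = [];

    public function save(User $user)
    {
        $user->id = count($this->saved) + 1;
        $this->saved[] = $user;
    }
};

$service = new UserRegistrationService($repository);
$user = $service->register('ada@example.com', 'correct horse');
```

Follower lookups and SMS sending would then live in their own classes, each injected where it is needed.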
If you have limited knowledge of application architecture, I would suggest you first read this book about design patterns: http://amzn.eu/aNVH8Ii
The second point is not to use the Slim framework. It is a small framework for people who already know what they want to build and how to do it. It is definitely not a framework for learning patterns or application architecture.
I would suggest to take a look at Yii 2: http://www.yiiframework.com/doc-2.0/guide-index.html
Yii uses most of the design patterns and architectural solutions that are commonly used in big applications today, and it is easy to learn and understand.
Hi, r/PHP. I'm having trouble understanding Repository pattern.
As far as I know, a Repository simulates an in-memory collection and encapsulates persistence-related code under the hood. Does that mean that my Repositories should not have a public persist method? I mean, if I had a collection of Users in an array or in some UserCollection object, I wouldn't have one.
Currently I'm implementing primitive repositories and work with them like this:
$user = $userRepository->getById($userId);
$user->block();
$userRepository->persist($user);
But since we're talking about a collection, there shouldn't really be a public Repository::persist(), right? It should decide on its own when to actually persist, and client code does not need to bother. Right? So the client code should look like this:
$user = $this->repository->getById($userId);
$user->block();
Or am I wrong about it?
EDIT
I'll make my question clearer. Is it correct that, being a collection, a Repository should only have add and delete as public "persistence" methods, and should handle the actual writing internally (e.g. in a destructor)?
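To make the second variant concrete, here is one possible reading of it, sketched under heavy assumptions: the repository remembers every object it hands out and writes changes back in a single flush() step, which some outer layer (rather than the client code, and rather than a destructor) would call. The ArrayObject stands in for a database table, and flush() is a hypothetical name.

```php
<?php
class User
{
    public $id;
    public $blocked = false;

    public function __construct($id)
    {
        $this->id = $id;
    }

    public function block()
    {
        $this->blocked = true;
    }
}

// Hypothetical collection-style repository with no public persist().
class UserRepository
{
    protected $storage;      // ArrayObject standing in for a database table
    protected $loaded = [];  // every object handed out, keyed by id

    public function __construct(ArrayObject $storage)
    {
        $this->storage = $storage;
    }

    public function getById($id)
    {
        if (!isset($this->loaded[$id])) {
            $user = new User($id);
            $user->blocked = $this->storage[$id];
            $this->loaded[$id] = $user;
        }
        return $this->loaded[$id];
    }

    // Writes the state of all loaded objects back to storage in one go
    public function flush()
    {
        foreach ($this->loaded as $id => $user) {
            $this->storage[$id] = $user->blocked;
        }
    }
}

$table = new ArrayObject([5 => false]);
$repository = new UserRepository($table);

$user = $repository->getById(5);
$user->block();                    // client code stops here

$blockedBeforeFlush = $table[5];   // storage is still unchanged
$repository->flush();              // called by an outer layer, e.g. at the end of a request
$blockedAfterFlush = $table[5];
```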