Summary
In this episode, we're talking about the pipe filter pattern in software architecture. This pattern is a great way to distribute tasks and problems to be solved, and it's applicable in many situations. We'll discuss the benefits and challenges of using this pattern, and how it can be used to solve problems with a stream of data.
Detailed Notes
The pipe filter pattern is a processing pattern for architecture that is used to solve problems with a stream of data. It's a great way to distribute tasks and problems to be solved, as each filter has well-defined inputs and outputs. The benefits of using this pattern include the ability to add, reduce, or swap filters out to do different manipulations of data for performance or scalability. However, the challenge with this pattern is that it can grow very large and become difficult to manage if not designed properly. It's essential to be intentional when building out the system and to design it with scalability in mind. The pipe filter pattern is applicable in many situations, including file processing, data streams, and transactional data. It's also a great way to distribute work among different teams and to test each filter individually before putting them together to create a system.
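As a rough illustration of the notes above, here is a minimal Python sketch of a pipe with two filters. The record shape (a dict with a "name" key) and the names drop_invalid, uppercase_name, and run_pipeline are assumptions made for this example; the episode does not name a language or library.

```python
from typing import Callable, Iterable, Iterator, List

Record = dict  # hypothetical record shape: one dict per row
Filter = Callable[[Iterable[Record]], Iterator[Record]]


def drop_invalid(records: Iterable[Record]) -> Iterator[Record]:
    """Filter: discard records that have no name."""
    for record in records:
        if record.get("name"):
            yield record


def uppercase_name(records: Iterable[Record]) -> Iterator[Record]:
    """Filter: standardize the name field; same structure in and out."""
    for record in records:
        yield {**record, "name": record["name"].upper()}


def run_pipeline(source: Iterable[Record], filters: List[Filter]) -> Iterator[Record]:
    """Thread the source through each filter, in order."""
    stream = source
    for apply_filter in filters:
        stream = apply_filter(stream)
    return iter(stream)


rows = [{"name": "ada"}, {"name": ""}, {"name": "grace"}]
result = list(run_pipeline(rows, [drop_invalid, uppercase_name]))
```

Because every filter takes and returns the same kind of stream, adding, removing, or swapping a filter is just a change to the list passed to run_pipeline.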
Highlights
- The pipe filter pattern is a processing pattern for architecture.
- It's a great way to distribute tasks and problems to be solved.
- You can add, reduce, or swap filters out to do different manipulations of data for performance or scalability.
- Each filter has well-defined inputs and outputs.
- You can test each filter individually and then put them together to create a system.
Key Takeaways
- The pipe filter pattern is a processing pattern for architecture that is used to solve problems with a stream of data.
- It's a great way to distribute tasks and problems to be solved, as each filter has well-defined inputs and outputs.
- The benefits of using this pattern include the ability to add, reduce, or swap filters out to do different manipulations of data for performance or scalability.
- However, the challenge with this pattern is that it can grow very large and become difficult to manage if not designed properly.
- It's essential to be intentional when building out the system and to design it with scalability in mind.
- The pipe filter pattern is applicable in many situations, including file processing, data streams, and transactional data.
Practical Lessons
- Use the pipe filter pattern to distribute tasks and problems to be solved.
- Design the system with scalability in mind.
- Test each filter individually before putting them together to create a system.
- Use tools and features to advance your career and solve problems.
Strong Lines
- It's a great way to distribute tasks and problems to be solved.
- You can add, reduce, or swap filters out to do different manipulations of data for performance or scalability.
- Each filter has well-defined inputs and outputs.
- You can test each filter individually and then put them together to create a system.
Blog Post Angles
- How to use the pipe filter pattern to solve problems with a stream of data.
- The benefits and challenges of using the pipe filter pattern.
- How to design a system using the pipe filter pattern with scalability in mind.
- How to use tools and features to advance your career and solve problems.
Keywords
- pipe filter pattern
- software architecture
- data streams
- transactional data
- file processing
- scalability
- performance
Transcript Text
Welcome to Building Better Developers, the Develpreneur podcast, where we work on getting better step by step, professionally and personally. Let's get started. Hello and welcome back. We are continuing our season where we're talking about patterns and anti-patterns in software architecture. We're starting off with some patterns. In this episode, we're going to talk about the pipe filter pattern. This is a little bit more of a processing pattern for architecture. Some of the things we've talked about are more general resource architecture layouts, when you talk about client server, even master slave, some things like that. Pipe filter is, probably as the name implies, a pipe. You've got a straight line shot of stuff and then you're applying filters, which is how this architecture works. The idea is that you have a pipe, a stream or something similar to that, maybe not a technical stream of data, but nevertheless data that we are pushing through the pipe. Along the way, we have filters, and this is a very useful pattern in a lot of different areas. When you think of file processing in particular, it's maybe something where you first have a load of the file. You load the data in and then, let's say, you're going through and you're doing a dedupe, or probably before that you're doing some sort of a cleanup. You're removing, filtering any data that's invalid, and then maybe you have some sort of a filter that is a data correction. For example, phone numbers: you're bringing in phone numbers, and in some cases they have dots and in some cases they have dashes. In some cases they don't have anything other than numbers. In some cases they may have parentheses and dashes. They have all of these different formats. Dates are the same thing. There may be all sorts of different formats.
So you have filters to "clean," in quotes maybe, or to standardize the format, and eventually you have a filter maybe that does a dedupe of some sort or does some sort of a final processing that takes that data that has been cleaned and applies it into a system, updates a record, something along those lines. Now, the load and the final save may be done outside of that process, or they may be the two ends of the pipe, but along the way you have these filters that are doing some work on the data that you're passing through. The key to the pipe filter pattern is that each filter has very well defined inputs and outputs. Generally speaking, you are bringing the data in, and there is an assumption with that filter that the data is of a certain format or in a certain structure or something like that. And then the output probably is going to be that same structure. So for example, again, let's say you've got an XML file. XML comes in, XML goes out. And that's very nice because it's a structured data stream. So we can bring it in, we can look for certain properties or values or attributes, things like that, make some changes, maybe even add to it, remove, clean up, whatever, and spit out an XML formatted stream on the other end. And then you could put another filter in and do different work. And that's one of the values of the pipe filter pattern: you have this pipe, you know data is coming in, you know data is going out. And then you put in a filter that essentially matches the incoming data in structure and the outgoing data in structure. And so theoretically, you could put as many filters in there as you want, because each filter is also like a mini pipe. It has the same format coming in, the same format going out. And therefore you can just keep sliding in filters as you need to. Now there may or may not be some sort of a prerequisite or some sort of order of execution that you need for those filters.
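The phone number cleanup and dedupe steps described above might look something like the following Python sketch. The record layout and the function names normalize_phone and dedupe are hypothetical, chosen only for illustration. It also shows why order of execution can matter: deduping before the numbers are standardized misses duplicates that differ only in formatting.

```python
import re
from typing import Iterable, Iterator


def normalize_phone(records: Iterable[dict]) -> Iterator[dict]:
    """Filter: strip dots, dashes, spaces, and parentheses so only digits remain."""
    for record in records:
        digits = re.sub(r"\D", "", record["phone"])
        yield {**record, "phone": digits}


def dedupe(records: Iterable[dict]) -> Iterator[dict]:
    """Filter: keep only the first record seen for each phone number."""
    seen = set()
    for record in records:
        if record["phone"] not in seen:
            seen.add(record["phone"])
            yield record


rows = [
    {"name": "Ann", "phone": "(555) 123-4567"},
    {"name": "Ann", "phone": "555.123.4567"},
    {"name": "Bob", "phone": "5559876543"},
]

# Standardize first, then dedupe: the two "Ann" rows collapse into one.
cleaned = list(dedupe(normalize_phone(rows)))

# Dedupe first, then standardize: the formats still differ, so nothing is removed.
wrong_order = list(normalize_phone(dedupe(rows)))
```

Both filters accept and emit the same record structure, which is what lets them be slid into the pipe in either position, even though only one ordering gives the intended result here.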
Now generally, when you put filters into a pipe, there is a set order of execution of the filters. But maybe you have some of them that you could swap; you can mix and match them. More importantly, you should be able to remove all filters save one. And that allows you, testing wise, to simply put one filter in and make sure that it is properly filtering, and then test the next one, the next one, the next one. So you can individually test all the filters, and then you can start piecing them together and create a system. If you think of Lego blocks and Lego bricks, you can take a brick and then stack another one on top of it, another one on top of it, another one on top of it. You can think of those as like a pipe with filters, because generally speaking it's the same input and output. You either have the top end, with the set size nubs, or the bottom end that essentially accepts those nubs. And so it makes it very easy to put extra stuff in, and just like a filter, you could change out the colors of blocks and things of that nature. Filters are the same thing. You can add, reduce or swap them out to do either different manipulations of data, for performance issues, for scalability. There's all kinds of different reasons that you might want to plug and play your filters into your pipe filter pattern. Technically it's fairly straightforward. You just need something that accepts data in the same format as the pipe and kicks it out in the same format that the pipe expects. And then within it, you're just taking data in, pushing it back out. You just thread the data through the filters and you're off and running. And this also allows the filters within the pattern to be fairly simple.
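The "remove all filters save one" approach to testing can be sketched in Python as below. This is only an illustration under assumed names (strip_whitespace, drop_blank, pipe, check_filters) using plain strings as the data; it is not taken from the episode.

```python
from typing import Callable, Iterable, Iterator, List

LineFilter = Callable[[Iterable[str]], Iterator[str]]


def strip_whitespace(lines: Iterable[str]) -> Iterator[str]:
    """Filter: trim leading and trailing whitespace from each line."""
    for line in lines:
        yield line.strip()


def drop_blank(lines: Iterable[str]) -> Iterator[str]:
    """Filter: discard lines that contain only whitespace."""
    for line in lines:
        if line.strip():
            yield line


def pipe(lines: Iterable[str], filters: List[LineFilter]) -> List[str]:
    """Run the lines through the given filters in order and collect the result."""
    for apply_filter in filters:
        lines = apply_filter(lines)
    return list(lines)


def check_filters() -> bool:
    sample = ["  a ", "   ", "b"]
    # One filter in the pipe at a time.
    assert pipe(sample, [strip_whitespace]) == ["a", "", "b"]
    assert pipe(sample, [drop_blank]) == ["  a ", "b"]
    # Pieced together; this particular pair gives the same result in either order.
    assert pipe(sample, [strip_whitespace, drop_blank]) == ["a", "b"]
    assert pipe(sample, [drop_blank, strip_whitespace]) == ["a", "b"]
    return True


all_passed = check_filters()
```

Each filter is verified alone before the combined pipe is checked, which mirrors the Lego analogy: confirm each brick fits, then stack them.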
Thinking about my initial discussion about just sucking in a file, well, maybe you have a whole series of filters where one just filters addresses, one filters numbers, one filters phone numbers, one filters dates, one filters customer records. You could have all of these little filters that you stack up, and any one filter may do very little. It may have a very specific piece of work, which allows you to take something that could be very complex and chop it up into smaller, bite sized pieces that could even be individually created. As an implementer of a filter, a creator of a filter, you don't need to know that there are other filters out there. You don't need to know anything other than the input and the output. So whether there's one filter or 4,000 filters in the pipe filter pattern, you don't care. You focus on yours. So it's not only a great way to distribute the tasks and the problems to be solved. You can also distribute that out to different teams. This could be something where you have some sort of secret sauce, some sort of top secret development that you're doing. Through the pipe filter, you can send out all of those different filters to be developed by different people, and they don't actually know what the end result is until all of the filters are together. I think there's probably some spy movie or novel or things like that that roughly work on that same idea. I'm pretty sure there's a lot of super villain, James Bond kinds of things, or something like the mummy's tomb, where until you have all of the filters together, all of those pieces in place, you aren't able to figure out what the end result is. It ends up in a special color or super secret staff or whatever the heck it is. Basically, you have the pieces, you don't have the complete solution. So this, again, is a way that you can do it without needing to get your secret sauce out.
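One way to make "you only need to know the input and the output" concrete is to check that contract around every filter, so independently built filters can be verified without knowing about each other. The Python sketch below is an assumption-heavy illustration: REQUIRED_KEYS, with_contract, digits_only, and drops_phone are all hypothetical names, and the contract (each record has "id" and "phone") is invented for the example.

```python
from functools import wraps
from typing import Callable, Iterable, Iterator

REQUIRED_KEYS = {"id", "phone"}  # hypothetical contract shared by every filter


def check_contract(records: Iterable[dict], stage: str) -> Iterator[dict]:
    """Pass records through unchanged, raising if one breaks the contract."""
    for record in records:
        missing = REQUIRED_KEYS - set(record)
        if missing:
            raise ValueError(f"{stage}: record is missing {sorted(missing)}")
        yield record


def with_contract(filter_fn: Callable) -> Callable:
    """Wrap a filter so both its input and its output are checked."""
    @wraps(filter_fn)
    def wrapped(records: Iterable[dict]) -> Iterator[dict]:
        checked_in = check_contract(records, f"{filter_fn.__name__} input")
        return check_contract(filter_fn(checked_in), f"{filter_fn.__name__} output")
    return wrapped


@with_contract
def digits_only(records: Iterable[dict]) -> Iterator[dict]:
    """A well-behaved filter: changes a value, keeps the structure."""
    for record in records:
        digits = "".join(ch for ch in record["phone"] if ch.isdigit())
        yield {**record, "phone": digits}


@with_contract
def drops_phone(records: Iterable[dict]) -> Iterator[dict]:
    """A misbehaving filter: its output no longer matches the contract."""
    for record in records:
        yield {"id": record["id"]}


good = list(digits_only([{"id": 1, "phone": "555-1234"}]))

try:
    list(drops_phone([{"id": 1, "phone": "555-1234"}]))
    violation_caught = False
except ValueError:
    violation_caught = True
```

A team handed only the contract and some mocked up records could build and check its filter this way without seeing the rest of the pipe.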
It probably isn't going to be data dependent. So if you're doing something like healthcare data, where you have to worry about HIPAA requirements, or financial data, where you don't want account numbers to be valid and things like that, you can always use mocked up data. But in this case, it's even easier because it's such a simple input and output. The data roughly needs to look like this. Boom, the data looks like that. It may look different when you go to production. Obviously, you want to test that at the end with the system and make sure that all of the filters do what they're supposed to do and don't step on each other in some way. Other than that, it's easy to farm out that work. The key, though, is that each step within that is its own little subsystem. This is one problem and one solution that it does. Or I guess it can solve a couple of different problems, but it's self-contained within that filter. The challenge with the pipe filter pattern is that it can grow very, very big. The whole idea of, oh, hey, we'll just throw another filter in, can get out of hand. You could have 4,000 filters, and suddenly something that is cruising right along and processing well grinds to a halt because there's just too much going on. There are too many filters in the system. And again, because it's relatively simple to put one in, if you don't design things properly and if you're not intentional in how you're building out the system, then you can end up in this very nasty scope creep kind of thing where it's like, oh, well, we'll just throw in another filter, and we'll just throw in another filter, and we'll just throw in another filter. The next thing you know, you have this huge, ridiculous backlog of filters that you've built or that you have to build. And eventually, at some point, there are just too many. There are so many filters that it essentially jams the pipe. It keeps things from moving.
Just like anything else, if you think of water flowing through a pipe, if you have a filter with big holes, then it's going to be able to filter out bigger items. And then maybe eventually you have smaller filters to filter out other stuff. And of course, you may set up your filters so you start with larger items and eventually get down to very small items. But eventually you're going to get to something where it takes a while for the water to go through that filter because of what it's doing. At any step of the way, putting that filter in is going to slow the processing from point A to point B. If you put in a filter, it is some sort of friction. There's some cost in time and, maybe in our world, processing and/or memory. So those are things that you need to keep in mind when you're applying the pipe filter pattern. Again, you may have multiple times that you use this pattern within an overarching system, because it's really about solving a problem with a stream of, usually, data. In our case, you're pushing data through and then you're doing these filters. You're doing these adjustments on it. You may see that numerous times. It may be done for your customer records, for your accounting records, for your order records, for company records, employee records, whatever it is. It may be for your sales data. It may be for data that's coming in from transactions, or from truck drivers and where they're at in their locations. Who knows? It's just data. It's going from point A to point B. And along the way, we're going to do some stuff with it. And this is applicable in enough situations that there are tools out there that help you do such things.
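To see the "friction" each filter adds, one option is to time every stage. The Python sketch below is a minimal illustration with assumed names (keep_even, square, run_timed). It uses eager, list-based filters on purpose so that each stage's time is measured on its own; with lazy generators the work of upstream filters would be mixed into downstream timings.

```python
import time
from typing import Callable, Dict, List, Tuple

ListFilter = Callable[[List[int]], List[int]]


def keep_even(values: List[int]) -> List[int]:
    """Filter: keep only even numbers."""
    return [v for v in values if v % 2 == 0]


def square(values: List[int]) -> List[int]:
    """Filter: square every value."""
    return [v * v for v in values]


def run_timed(values: List[int], filters: List[ListFilter]) -> Tuple[List[int], Dict[str, float]]:
    """Run the filters in order, recording the seconds spent in each one."""
    timings: Dict[str, float] = {}
    for apply_filter in filters:
        start = time.perf_counter()
        values = apply_filter(values)
        timings[apply_filter.__name__] = time.perf_counter() - start
    return values, timings


output, timings = run_timed(list(range(6)), [keep_even, square])
total_seconds = sum(timings.values())
```

The per-filter numbers make it easier to notice when "just one more filter" starts to dominate the time from point A to point B, and putting a filter that discards data early (here keep_even) reduces the work left for the filters after it.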
Right now a lot of it is data streams, and you can see a lot of things that are focusing on the whole idea of streams, because there's just so much data going through that we just plop a little filter in and allow that to do the work that we need to be done on it. That being said, I think that's all that really needs to be said about the pipe filter pattern. Again, it's one of those software architecture ones that you probably have used. Maybe you didn't realize the name of it, but now it's another arrow in your quiver, something in your arsenal next time you need to solve a problem. That'll wrap it up this time, and we will come back next episode and we will talk about yet another software architecture pattern. But as always, go out there and have yourself a great day, a great week, and we will talk to you. Great momentum and great success.
There are two things I want to mention to help you get a little further along in your embracing of the content of Develpreneur. One is the book, The Source Code of Happiness. You can find links to it on our page out on the Develpreneur site. You can also find it on Amazon; search for Rob Broadhead or Source Code of Happiness. You can get it on Kindle. If you're an Amazon Prime member, you can read it free. A lot of good information there. That'll be a lot easier than trying to dig through all of our past blog posts. The other thing is our mastermind slash mentor group. We meet roughly every other week, and this is an opportunity to meet with some other people from a lot of different areas of IT. We have a presentation every time, and we talk about some cool tools and features and things that we've come across, things that we've learned, things that you can use to advance your career today. Just shoot us an email at info@develpreneur.com if you would like more information. Now go out there and have yourself a great one.