The Entity, Attribute and Value (EAV) database architecture is at first, extremely difficult to grasp. Combine this with the lack of documentation on EAV and you find that most people don't truly appreciate just how good EAV is and more importantly, how well suited it is to Magento. In this article I will attempt to shed some light on EAV in the hopes of helping you understand how it works, why it works and how this knowledge can benefit you as a Magento developer. To better understand this article, I recommend opening up a development Magento database using a tool such as phpMyAdmin.
What is EAV?
EAV stands for Entity, Attribute and Value. Let's look at each part of that and try to understand them a little better.
The entity represents Magento data items such as products, categories, customers and orders. Each entity (product, category etc) will have it's own entity record in the database.
The attributes represent data items that belong to an entity. For example, the product entity has attributes such as name, price, status and many more.
The value is the simplest to understand as it is simply a value linked to an attribute.
To better understand this, let us consider the product entity. Each product entity will have a series of attributes, one being the attribute name. Each product will then have a value for the attribute name (and all other attributes). This might not be clear yet but keep reading!
How does EAV work?
Before Magento, databases seemed a lot simpler. If you were designing an eCommerce application, you had one table that contained all of your product information, another that contained your category information and maybe another table that linked these two together. This is simple enough and easy to understand, where as Magento has almost 40 tables for the products and category's alone! To understand why, let us use the product table as an example.
Rather than store all product information in one table, Magento splits this information up into sub tables. The top table in this hierarchy is catalog_product_entity. If you take a look at this table in phpMyAdmin, you will see that it includes simple base information for a product and does not appear to include any useful information other than the SKU! Fortunately, using this table it is possible to build a full product record from the attribute and value tables.
To start building a full product record, you will need to start joining attributes to the product entity table. Before you do this, take a look at the table called eav_attribute. eav_attribute is the main attribute store for Magento and is used to store attributes for all different entities (product, customer, order, category etc). Open this table in phpMyAdmin and click browse. Notice that there are hundreds of different attributes, some even with the same name? At first this confused me because I wasn't sure how Magento could differentiate between the the two different attributes called name. How did Magento know which one was for the product and which one was for a category? As is usually the case with Magento, a small bit of research led me to the an extremely simple answer: entity_type_id! Each entity (product, category, customer etc) is given an entity_type_id. To find this out, go back to catalog_product_entity and look for the entity_type_id field. The value for every record in that table should be 4, as this has been designated as the entity_type_id for products. If you were to look in catalog_category_entity you should see a different entity_type_id. Using this value and the attribute code, it is possible to load the attributes for a product, or any entity.
Consider the following queries
# Load all product attributes SELECT attribute_code FROM eav_attribute WHERE entity_type_id = 4; # Load a single product attributes SELECT attribute_code FROM eav_attribute WHERE entity_type_id = 4 AND attribute_code = 'name';
Now that you can get attributes and entities, it is time to start getting values. Values are separated across several different tables for reasons that I will go into shortly. For now though, just take a look at all tables that begin with catalog_product_entity. The way the values are split depends upon their type. For example, all prices and other decimal attributes are stored in catalog_product_entity_decimal where as all short text strings are stored in catalog_product_varchar. To figure out which table each attribute is stored in, Magento uses the column backend_type in the table eav_attribute. If you run the following query you should be able to find out the backend type for the product attribute 'name'.
SELECT attribute_code, backend_type FROM eav_attribute WHERE entity_type_id = 4 AND attribute_code = 'name';
Hopefully the above query returned the backend_type varchar, which is the correct type for name and all other short text strings. Based on what was said above, we can determine that the value for the name attribute will be stored in catalog_product_entity_varchar. What do you think will be produced by the following query? Have a think and then copy it into phpMyAdmin and see whether you're right.
SELECT e.entity_id AS product_id, var.value AS product_name FROM catalog_product_entity e, eav_attribute eav, catalog_product_entity_varchar var WHERE e.entity_type_id = eav.entity_type_id AND eav.attribute_code = 'name' AND eav.attribute_id = var.attribute_id AND var.entity_id = e.entity_id
The above code lists out the name and id for every product in your database. If you got that correct then congratulations, you are well on your way to understanding the EAV architecture. If you didn't get that, keep going and then give this another read afterwards. If you're still confused, post your questions in the comments or email me and I'll try to make things clearer. If you did understand that, can you see how using a simple PHP loop, you could cycle through all product attributes and retrieve each value, creating a full product model?
If you're running a multi-store Magento, here is how to adapt the above code to only include products from a certain store.
SELECT e.entity_id AS product_id, var.value AS product_name FROM catalog_product_entity e, eav_attribute eav, catalog_product_entity_varchar var WHERE e.entity_type_id = eav.entity_type_id AND eav.attribute_code = 'name' AND eav.attribute_id = var.attribute_id AND var.entity_id = e.entity_id AND var.store_id = 0
Why is EAV Used?
EAV is used because it much more scalable than the usual normalised database structure. Developers can add attributes to any entity (product, category, customer, order etc) without modifying the core database structure. When a custom attribute is added, no logic must be added to force Magento to save this attribute because it is all already built into the model; as long as the data is set and the attribute has been created, the model will be saved!
What are the down sides to EAV?
A major downside to EAV is it's speed. With entity data being so fragmented, creating a whole entity record requires a lot of expensive table joins. Fortunately, the team at Varien have implemented an excellent cache system, allowing developers to cache information that doesn't often change.
Another problem with EAV is it's learning curve, meaning a lot of junior developers give up before they can truly see the simplicity of it. While there is no quick fix for this, hopefully this article will help people start to over come this problem.
Entity, Attribute, Value is a great database structure and has been a key part to the success of Magento and therefore it is important for developers to understand how it works. There are also many applications for this knowledge and I'm confident if you work in Magento for long enough that you will come across some!
I intend to write a follow up article to this sometime soon, depending on how this one goes down. In the meantime, I wrote an article on updating prices globally in Magento that uses some techniques from this article. It might be a good idea to look at that to practise what you have learned here.
If you have any questions please let me know or post in the comments.