What I'm talking about here

Let's establish that almost everyone who speaks of a data dictionary is referring to the schema of the database, or at least something similar. That is, it is how you model the database, plus a set of metadata that helps the database system operate on that model. The metadata defines what can and cannot be done there: which tables, indexes, columns, keys, data types, constraints, triggers, etc. exist. For those who do not know the term, that is the widely known view of the data dictionary: a mechanism that holds all the definitions and rules of your data.

The extended use I mention in that answer is a concept that goes beyond the database: you use the data dictionary ( https://en.wikipedia.org/wiki/Data_dictionary#Middleware ) for your whole application. The term as used here is about software development and programming, not only about data modeling. We are still talking about data, so the name is not used for nothing, but we treat all development objects or artifacts (or assets) as development data.

That use, as far as I know, took off in ERPs in the 1970s. It seems to me that the first to do it, still in a discreet way, was IBM's COPICS (Communications Oriented Production Information and Control System). They took the idea of the DBMS data dictionary and added some things an ERP needed. From there many ERPs started copying it (in fact, the Standing Range System at its beginning was an almost exact copy of COPICS). Over time they kept adding more things, which ended up turning it into something else, but the name stayed. I learned the concept under that name.

On second thought, it was no longer a pure data dictionary. At the very least it is an application data dictionary. And since it is not just about data, it is more accurate and simpler to call it an application dictionary.
Well, that's what I'm talking about here.

A few years ago it was easy to find information on the internet about what I'm describing. Now it takes much more effort: the noise increases a lot when a term is used in more than one context and one of them overshadows the other. We do not have a formal, universally accepted term for the subject, so searches have to use "data dictionary" and you manually filter what is about the DB from what is about the application.

It is possible to have a corporate data dictionary; after all, the concept applies to any type of data or process, not only in IT. But here I will only talk about its use in software development.

Both the database concept and the corporate one are related to what I'm talking about here, but they have different goals and different assets.

What it is and what it is for

It serves to manage complexity while giving flexibility, in an extremely productive way, giving great power to the programmer and even to the user (which can be questionable).

The initial cost of creating a proper dictionary is high, and one built by people who do not understand it may not give the expected results. For applications that will change little the advantage is smaller; the gain comes over time, as changes can be made quickly and with much more confidence. The DD in English, or DdD in Portuguese (another DDD :P), is there to give you productivity and robustness.

In some implementations it can really help productivity; in others it can hurt. Do not use this technology in trivial or niche systems, in systems that do not have a large volume of objects, or in systems that do not need constant changes in business rules (and even then, I am not talking about just any change, such as those that occur in most non-LOB software).

For me the main advantage is precisely what is in the original answer: it is about DRY, which I consider the most important software development principle there is.
Of everything that is said about managing complexity, maintainability and other concepts preached in software engineering, DRY is the one that contributes the most. Some "modern" techniques being sold around preach giving it up, which is one of the reasons I'm critical of them. They preach increasing complexity in order to manage complexity. They are new, unproven, and go against what has been proven for decades.

I will not deny that some consider the DD a complex technique, and indeed it is. But if it is done well, this extra complexity can become transparent to the system. In fact a platform is being created, make no mistake. And it should already be clear that it is not good for systems that are too small or too simple. But it becomes more important in the systems people develop nowadays, which have excessive complexity and, above all, excessive redundancy. Wherever there are layers there is excessive complexity, and only the data dictionary can bring you back to DRY.

The AD (application dictionary) is about keeping all information about the application in one place, all of it, even the documentation. The DD is very Agile ( https://agilemanifesto.org/ ), but go and check whether any Agile proponent has ever heard of it.
- You keep the documentation inside the application and guarantee that the documentation changes along with the application.
- It focuses on the needs of people and not on the software development process.
- It produces quality software with a reduced number of tools.
- The user can participate actively or passively, but without the technical noise they do not understand.
- It responds to change with agility, in short iterations by nature; after all, if a change is easier to make, the iteration becomes shorter and more predictable, and therefore more manageable.

Project Management

It also possibly reduces the number of people involved, addressing the problem described in The Mythical Man-Month ( https://en.wikipedia.org/wiki/The_Mythical_Man-Month ), and even involves the client more directly.

Much of what I say here is in that book, which is the canonical reference on project management. I am not inventing anything; I only organize and interpret things that are well established. Some I do not mention explicitly, but the DD helps with almost every point the book touches. I recommend reading it and all the classics of our field. I feel sorry for the new people entering the field who will never even hear of these books because they only care about the technology of the day.

It:

- helps to track the progress of the project in a natural and almost transparent way, and lets you see the whole history (if done well, as I always say),
- helps reduce the need for highly skilled programmers (although creating the DD does need them),
- reduces the need for testing, evaluation, and even for thinking too hard about architecture on every change, making it easier to ensure conceptual integrity, something few people talk about and one of the most important things in software development.

AD is a concept, but it ends up becoming a tool. It is usually managed through a framework, or even an SDK, because it is a library tightly integrated with your application and/or external tools that help manage the dictionary. That is a platform.
But is it different from the other frameworks that people use so much?

Would it be so obscure and rejected (even if only by omission) if a big player launched something like it? For example, if Visual Studio shipped with such a tool and .NET supported it, wouldn't the DD see massive adoption, even where it shouldn't? (In fact they did create this, https://docs.microsoft.com/en-us/previous-versions/ff851953(v=vs.140) , built by people who did not quite understand what needed to be done, so it did not work.)

It is no different from using an ORM; in fact it replaces an ORM with advantages, since all the intelligence lives in the dictionary. It ends the dichotomy between code-first and model-first: it is dictionary-first and unique, a single source of truth ( https://en.wikipedia.org/wiki/Single_source_of_truth#SOLID_&_Source_Code ), which in some cases can turn into a single version of the truth ( https://en.wikipedia.org/wiki/Single_version_of_the_truth ). Also read about the system of record ( https://en.wikipedia.org/wiki/System_of_record ), one of the foundations of the DD, but applied to development.

Anyway, it is a real find, and whoever knows it and can afford it never wants to let go.

Criticisms

But of course it is not free from criticism. Although some criticism is fair, the one I hear most (and rarely, because people don't know the application data dictionary and don't want to know it; they just want to learn what is fashionable, what everyone is talking about, which is not the case here), and a quite valid one, is that it creates the so-called second-system effect ( https://en.wikipedia.org/wiki/Second-system_effect ) or inner-platform effect ( https://en.wikipedia.org/wiki/Inner-platform_effect ). That is indeed not good, but it is a price that has to be paid for a number of benefits. Don't be fooled, though: people build things so complex these days that they end up creating the same complexity by accident, without as many benefits. At least with the AD you know you are creating complexity.
People create these aberrations all the time without realizing it, and this is especially true in "web applications".

I suggest reading the Wikipedia article on the inner-platform effect, because much of what is called good practice, and many of the architectures and design patterns preached nowadays, end up falling into it. Pay attention! The problem is that people don't understand that.

Among the things cited in that text I will single out https://pt.stackoverflow.com/q/96409/101 as an example, but there are more specific ones, such as the wrong use of exceptions. And there is controversy when the article says that a virtual machine is always a good choice and an acceptable inner platform. If that is true, the AD certainly is too, because it brings far more benefits. It comes close to being a silver bullet, so large is the gain (once the initial cost is amortized, which is why it would be better to have something ready-made; but that is not easy to build so that it meets all demands, which incidentally is one of the problems SQL has, and which gave birth to NoSQL).

Anyone without experience will fall into a series of traps that I have already fallen into, for example ending up in Greenspun's tenth rule ( https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule ). I learned about the DD (which I now call AD) in a large project at a large ERP company, and it never worked well, but it was always very useful anyway.
The problem is that it was built without knowing what an AD should be, without thinking about the future, and it was used more as a marketing tool than as engineering.

I see the same in products like Dynamics ( https://dynamics.microsoft.com/en-us/ ) from Microsoft, just to name one example (and an above-average one) of people who built the AD without understanding what it is (others did worse or have nothing similar), without planning, to meet a need that came up in the middle of the project rather than as an initial requirement, and calling it DD, which is already wrong from conception.

I do not recommend doing this in production before you understand the subject well, unless a third party builds a well-thought-out product for you to consume. I don't build one myself because I don't know how to finance or market it (although today that kind of thing would only succeed as open source); that is the problem of being an engineer. One thing I'm sure of: trying to build it the way almost every product is built today will produce a very bad result. It cannot be born as a minimum viable product ( https://en.wikipedia.org/wiki/Minimum_viable_product ); that was the mistake of the current ERPs. And I am afraid of the monster it could become in trying to meet everyone's needs, so I would have to think of a solution that would probably become a third system, which can be very bad.

Of course, all these criticisms also apply to much of what we use today without thinking. Take for example Zawinski's Law ( https://en.wikipedia.org/wiki/Jamie_Zawinski#Principles ): if taken seriously, we should use only Assembly.

One difficulty I can cite is that it demands good programmers to implement it. It also requires a minimum of competence to use, but that is an almost universal requirement. Perhaps it suffers a little more at a certain level of use, because the person needs to understand how it works and the relationships between everything, but I can't see how that is very different from normal code.
In fact it may allow less experienced programmers to take care of some more isolated assets, which they could not do before.

It is necessary to assemble what is called a surgical team, where each person has a specialty. There is the chief surgeon (the AD architect), the other surgeons (engineers of the specific domains), and the assistants who take care of only one aspect of the surgery (programmers who code the most detailed assets), those who do the less relevant things, or things that are easy to do but still need to be done well.

I don't know if it pays off in any software (well, "any" is an exaggeration; some are obviously ruled out). But do you know? How? I speak from experience; out there you find unfounded opinions, where the person just thinks it is not good. I have experience doing this on an ERP platform and in a dynamic language. I have some experience outside that, but it is limited. Does it work well in C# in a company's internal ERP? Does it work outside an ERP? I would like to try at least the first; I think my experience would make it a success, and if someone wants to hire me for this type of project I am available :D.

Obviously you cannot be afraid of becoming a platform when the goal is to be a platform. But is it worth having a platform when that was not the goal?

What problem it solves

Today it is common to have the documentation (possibly in several instances), the database (and more and more we see several of them for the same thing), and several layers of the application, which can be the model, the controller and the view, on the server and then again on the client, possibly in different languages, plus other layers, services, contexts or other needs, with a GUI client, web, "API" or even CLI. Every new field involves analyzing and potentially changing a huge number of places, and you cannot forget any of them.
And it may be that other departments, not yours, or even the user when they have the privilege to do so, have used it somewhere you don't even know about. Without control over all the assets and where they are used, it cannot work out.

In the era of microservices and DDD, to cite just two new plagues that have been invented, the use of the data dictionary should be mandatory. I have said before here on SOpt that a better DDD implementation could make it more palatable and feasible in many cases. That probably goes through the AD. I am not talking about something that denies modernity, but something that facilitates its adoption, if it is really still necessary.

AD makes OOP "obsolete". Not that you cannot use that way of thinking in specific software mechanisms (see https://pt.stackoverflow.com/q/344486/101 ), but using it to define all the software does not make much sense, not least because the application dictionary tends to follow a more relational way of doing things (it resembles the database concept, but it is not the same thing). AD can eliminate the need for OOP and all the design patterns normally associated with that model, keeping complexity much more under control.

In fact, OOP is a simplified and naive way of making an AD that focuses on the object and not on the relations. Think about which is more important and more likely to cause trouble: the object, or the relationships these objects have with each other? The DD focuses on the relationship, even though it also has everything OO has. Oh, and it encapsulates and abstracts much more.

Types of dictionaries

Active application dictionary

Usually implemented in dynamically typed languages, or at least in statically typed languages with powerful reflection mechanisms.

This form lets the user configure in the application whatever they want to change, and everything is reflected there, on the spot, during execution. Sounds great, right?
Not in my experience.

Users tend to get it wrong, and this freedom creates a situation in which we practically go back to managing data and complex processes through spreadsheets (whoever has seen this knows what a problem it is). And this apparent freedom imposes some limits on what can be done in the dictionary. I think the biggest mistake of ADs is precisely wanting to be a user tool and, in a way, a marketing one. It makes everyone's eyes shine, and no one sees the headache it will be.

There is a relatively high performance cost, especially if you perform all the checks needed for robustness.

People overestimate the need to change software behavior at runtime. In fact, the ERPs I know that use an AD in practice need to be restarted for many reasons, both the client (even web) and the server.

Passive application dictionary

Generally implemented as code generators. Today I like this form better, perhaps because my bias is now toward static languages.

It is used as a mechanism for the developer, not the user. You have a catalog of the objects in your system to give an overview and control change. After changing something in it, the application that will be sent to the user has to be regenerated. It is even possible to generate different versions depending on the user.

This way the user no longer becomes part of the product engineering, and changes stay in the hands of those who are, in theory, better equipped to think them through. The user can participate in the change passively.

Another very big gain is performance; after all, much of what would be decided at runtime is resolved at code generation. Not to mention the simplification, since the complexity is abstracted away by the code generator, which is not very different from many solutions you may already use without even realizing it.

This is a form of https://pt.stackoverflow.com/q/119731/101 on steroids.

Operation

Since it never gained traction in the market, there are no very formal definitions of what to call things.
I will give an informal summary here. There is no room for a manual, nor do I have a solid enough foundation for this to become anything canonical.

It is very reminiscent of the data dictionary of a database, but it involves all the software.

In essence we have a catalog of objects that will be used in the application. Some people prefer to call them artifacts, but they probably need a different name, because they are neither one thing nor the other; at least they should not be confused with the concepts of object ( https://en.wikipedia.org/wiki/Object_(computer_science) ) or artifact ( https://en.wikipedia.org/wiki/Artifact_(software_development) ) (note that even that term is ambiguous: https://en.wikipedia.org/wiki/Enterprise_architecture_artifacts ); maybe they resemble https://en.wikipedia.org/wiki/Artifact_(UML) . So I will call them assets, a term used in games that I think suits this context.

By the way, there are similarities between UML and AD, and perhaps the AD should be what UML promised and did not deliver, because of excess bureaucracy and the lack of a concrete result, being just one more layer in development. That goes against the basis of what Agile is, a topic I touch on in https://pt.stackoverflow.com/q/321826/101 and https://pt.stackoverflow.com/q/269089/101 .

One of the reasons I think a universal AD tool cannot work out is that it would become something close to UML, which has already proved to be more of a hindrance than a solution, as some people have already realized.

The AD needs to be designed with certain realities in mind and serve more or less similar scenarios.
It is possible to build an AD that serves most ERPs or LOB systems in general, but not all types of software. It would be too generic and would take a lot of work to deal with the specifics, besides being too complex in order to be flexible enough for so many scenarios, because each new type of asset may have its own requirements.

The ideal would be a programming language and an environment made to work with an application dictionary, but I don't think that will happen. I have an idea for something like that, but it will never get off the drawing board because it requires a lot of resources. Nothing a reasonable community or a big company couldn't afford, but getting that involvement is a long way off.

I have been challenged on the fact that this language solution would only work in the IDE, without a compiler and a traditional workflow. What the person failed to understand, and I explain, is that the language serves the application dictionary, while the traditional language, which parses the normal code you already know, takes care of the algorithms, which are simple.
The data structure, which is the complicated part, is the responsibility of the AD, which only makes sense inside the IDE, and the texts (code) of the algorithms are attached as software assets linked to other assets.

In the catalog you have all kinds of information that serves the software, all with advanced organization. To do it well you need a good understanding of taxonomy and probably ontology (less to organize than to define things better), not to mention the use of dialectics to model correctly, although that has less to do with the organization of the catalog than with the project as a whole.

There are virtual assets, which serve the development process more, and concrete ones, which will somehow end up in the software itself and are what the user deals with directly.

You can have all the repository controls (in an advanced and natural way) and issues of all kinds, including PRs, in the AD; you can attach the documents that support decisions and serve the workflow; or you can have some of the things that are typical of UML. Ultimately, whatever is useful to development can be placed there in an organized way so that it jumps out at you whenever convenient, even when you don't know much about it. Interconnecting these assets, even after the fact, is fundamental to the complete success of using an AD. What goes into the AD SDK depends on the methodology adopted.

It may have mechanisms that are not data, but behaviors in the process.
You can include continuous integration, for example, or tests (which may still be necessary, but in a different form; I am leaving the term quite open here).

There will also be assets for packages, namespaces, modules and types (classes, enumerations, structures, etc.), including usage rules that normally are not in ordinary code. These types can be of various natures, representing data in the database, the application, forms and reports, files, the network and other mechanisms, besides, of course, the business rules themselves. In short, everything in the software must be precisely cataloged.

Without a code generator, or without the code itself making the necessary adjustments at runtime, you end up with duplicated effort and the consequent loss of DRY, which causes all the problems we have today without the AD. That doesn't make the AD completely useless, but it makes the advantage harder to see.

You may be thinking that this is inflexible, but if it is done well the flexibility is equal to what you can produce with direct code, and in fact everything that is an algorithm still has normal code. What changes is the data structure.

How it is implemented varies according to the technologies used underneath the dictionary. Will you use C# or Python? SQL or NoSQL? And so on.

It is easy to see how everything becomes more organized and unique, allows various compositions, and is "closer at hand". Everything is there; you don't forget anything.

Every change can be propagated just by answering how it should be included in the other assets (a field being itself a more granular asset). It may need some manual work, assisted and monitored by the AD to avoid mistakes, or a justification of why it will not go somewhere it probably should. You automate the decisions. Should the field appear on every screen in the system that uses this entity? How, on each one? What about the reports?
And elsewhere?

If you start thinking about your applications with the AD in mind, even without implementing it, you will already see how much it can change the way you build software. Many of the problems you face, the things you think could be done differently, the things you find repetitive and annoying, would likely be different with an AD.

And the future may be even better with artificial intelligence around. AI only works when you have a very large and well-modeled base of data for making decisions, and the AD helps a lot with that.

It makes it easy to document APIs and keep them stable, preventing you from unintentionally changing something you must not change.

One related idea is the metadata repository ( https://en.wikipedia.org/wiki/Metadata_repository ); from there, just extend the assets to anything in the system and not only the database.

Another complementary idea is the semantic spectrum ( https://en.wikipedia.org/wiki/Semantic_spectrum ). Not that it helps define the AD better, but it gives some idea of what an AD model can look like.

It is common for AD assets to adopt cascading strategies. In some technologies this ends up avoiding inheritance even in mechanisms where inheritance has always been the norm.

Just to list some possible types of assets:

databases, logical tables, physical tables, directories, columns, indexes, triggers, enumerations, standard values, types, roles, restrictions, validation, relations, events, notifications, lookups, transactions, versions, families, activities, tasks, files, menus, pages, forms, fields, miscellaneous controls, reports, graphics, sessions, standard messages, scripts, functions, values, entities, schedules, accounts, users, identity, attributions, handles, parameterizations, questions, selections, miscellaneous data tables, languages, formats, locations, globalization, various metadata, privileges, audits, profiles, contexts, modules, subsystems, aggregates, policies, strategies, rules, exceptions (not about Exception, which is within types), priorities, workflows, facades, factories, descriptive, tickets, toe, documents, customers, infra, layers, external input APIs, external output APIs, imports, exports, processes, etc.

Some of these can be broken down or mapped further; these are just examples.
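To make the cascading idea a little more concrete, here is a minimal Python sketch, not taken from any real AD product. It assumes a catalog where each asset may be based on another asset and inherits its properties unless it overrides them. The names `Asset`, `Catalog`, `based_on`, `resolve` and `where_used`, and the sample invoice assets, are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Asset:
    """One catalog entry (hypothetical structure, for illustration only)."""
    name: str
    kind: str                                  # e.g. "domain", "column", "form_field"
    props: dict = field(default_factory=dict)  # properties set directly on this asset
    based_on: Optional[str] = None             # asset this one cascades from, if any


class Catalog:
    def __init__(self):
        self._assets = {}

    def add(self, asset):
        if asset.name in self._assets:
            raise ValueError(f"duplicate asset: {asset.name}")
        if asset.based_on is not None and asset.based_on not in self._assets:
            raise KeyError(f"unknown base asset: {asset.based_on}")
        self._assets[asset.name] = asset

    def resolve(self, name):
        """Effective properties: inherited values first, local overrides last."""
        asset = self._assets[name]
        inherited = self.resolve(asset.based_on) if asset.based_on else {}
        return {**inherited, **asset.props}

    def where_used(self, name):
        """Every asset that depends on `name`, directly or indirectly."""
        result = []
        for asset in self._assets.values():
            if asset.based_on == name:
                result.append(asset.name)
                result.extend(self.where_used(asset.name))
        return result


catalog = Catalog()
catalog.add(Asset("money", "domain", {"type": "decimal", "precision": 2, "label": "Amount"}))
catalog.add(Asset("invoice.total", "column", {"label": "Invoice total"}, based_on="money"))
catalog.add(Asset("invoice_form.total", "form_field", {"read_only": True}, based_on="invoice.total"))
catalog.add(Asset("invoice_report.total", "report_field", {}, based_on="invoice.total"))

# The form field never declares a type or precision; it gets them by cascade.
effective = catalog.resolve("invoice_form.total")

# Before changing the "money" domain, list everything that would be affected.
affected = catalog.where_used("money")
```

Here the cascade does the job that class inheritance would usually do, and `where_used` is the kind of query that answers "where does this change have to go?". A real dictionary would cover many more relation types than a single `based_on` link.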
You can see there is a lot of DB, GUI, common domains, design patterns, etc., all done in a different way from how it is coded today. I think that is enough to see the pattern of what can be included. I have only scratched the surface, and it depends on each case, which is why I think trying to make an AD that serves everything doesn't really work out. If that seemed like a lot, know that ERPs usually have many thousands of assets, even millions in some cases.

I have little experience making the AD take care of the mechanisms too, so I don't know how useful it remains in that part.

Real example

To show an example (I'm not endorsing it as a good implementation), here is https://www.radicore.org/ . There are some open-source ERPs (I don't like all of them) that use a data dictionary system and can be inspected, but they do everything very superficially, and the gain is not that great. I think in all of them it is mostly marketing. One is http://www.adempiere.com/Application_Dictionary . There are also some generators, such as http://labs.infyom.com/laravelgenerator/ .

Conclusion

I would have liked to analyze how the AD fits different scenarios, for example game development.

It is not a tool for every case or every scenario, but where it fits the gain can be substantial. I am talking about cost, simplicity, ease of maintenance, confidence in what you are doing, adherence to customer/user needs, and making everything more predictable.

Misuse can be tragic, as with any tool. Taking it too deep can start to bring more costs than benefits. It can also be complicated or costly to go deeper long afterward.

The paradigm shift is brutal, but that is the only way to get such a big gain. That is why people are afraid of it.

More specific questions may be interesting now that you have a notion of what it is.
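As a closing illustration of the passive application dictionary described above, here is a minimal Python sketch of the code-generation idea. It assumes a single entity is described once and that both the table definition and the class are generated from that description. The entry format, the `CUSTOMER` entity, the type mappings and the function names are all hypothetical, not the format of any product mentioned here.

```python
# Hypothetical dictionary entry: the entity is described once, in one place.
CUSTOMER = {
    "entity": "customer",
    "fields": [
        {"name": "id", "type": "integer", "required": True, "label": "Code"},
        {"name": "name", "type": "text", "required": True, "label": "Name", "max_length": 60},
        {"name": "credit_limit", "type": "decimal", "required": False, "label": "Credit limit"},
    ],
}

# Assumed mappings from dictionary types to target types.
SQL_TYPES = {"integer": "INTEGER", "text": "VARCHAR", "decimal": "NUMERIC(15, 2)"}
PY_TYPES = {"integer": "int", "text": "str", "decimal": "Decimal"}


def generate_ddl(entry):
    """Emit a CREATE TABLE statement from the dictionary entry."""
    columns = []
    for f in entry["fields"]:
        sql_type = SQL_TYPES[f["type"]]
        if f["type"] == "text":
            sql_type += f"({f.get('max_length', 255)})"
        null = " NOT NULL" if f["required"] else ""
        columns.append(f"    {f['name']} {sql_type}{null}")
    return f"CREATE TABLE {entry['entity']} (\n" + ",\n".join(columns) + "\n);"


def generate_class(entry):
    """Emit Python source for a dataclass matching the same entry."""
    lines = [
        "from dataclasses import dataclass",
        "from decimal import Decimal",
        "from typing import Optional",
        "",
        "",
        "@dataclass",
        f"class {entry['entity'].title()}:",
    ]
    # Required fields go first: dataclass fields without defaults
    # must come before fields that have defaults.
    ordered = sorted(entry["fields"], key=lambda f: not f["required"])
    for f in ordered:
        py_type = PY_TYPES[f["type"]]
        if f["required"]:
            lines.append(f"    {f['name']}: {py_type}  # {f['label']}")
        else:
            lines.append(f"    {f['name']}: Optional[{py_type}] = None  # {f['label']}")
    return "\n".join(lines) + "\n"


ddl = generate_ddl(CUSTOMER)
source = generate_class(CUSTOMER)

# The generated source is ordinary code: it could be written to a file,
# or, as here, loaded directly to show that it is usable.
namespace = {}
exec(source, namespace)
customer = namespace["Customer"](id=1, name="ACME")
```

Adding a field to `CUSTOMER` and regenerating updates the table definition and the class together, which is the DRY gain argued for throughout this answer. Forms, reports and validations would be further generators reading the same entry.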