The first Data Director update after the launch of Pimcore X again provides users of the export and import bundle with a number of improvements and innovations.
Find out in detail which innovations our developers have made in the latest version of the Data Director and which improvements you can benefit from with version 2.5.0:
Pimcore 10 compatibility
Version 2.5.0 is compatible with Pimcore 10, with one small restriction: not all libraries used are compatible with PHP 8 yet. For Pimcore 10, you therefore currently need to install the Data Director via Composer with --ignore-platform-reqs. We are working on PHP 8 compatibility, including for the libraries used.
Grid-filtered exports
It is now possible to use the grid view (folder view) for filtering and then export the objects selected in it. This way you can perform ad-hoc filtered exports without having to enter an SQL condition. To start the export, there is a new option "Data Director Export" in the drop-down menu of the "CSV Export" button.
In the following modal window you can select the dataport to be used for the export (only compatible exports for the selected grid data object class are shown).
Refactoring the (automatic) exports for API use
Previously, raw data for automatic exports was updated for each previously executed SQL condition. This resulted in poor performance when saving objects. This has been completely refactored: For automatic exports, raw data is now only updated for the configured Dataport SQL condition (and for each language). The updated raw data is then also used for the previously executed SQL conditions (so that the export data is already prepared when an export with this SQL condition is executed again). This way, the raw data only needs to be extracted once (for each language) and not again and again for all custom SQL conditions.
Also important for API usage: The SQL condition from the query parameter of REST API calls for Pimcore-based dataports now extends the SQL condition in the dataport settings instead of overriding it, so that the dataport condition cannot easily be bypassed (for example, to prevent unpublished objects from being retrieved).
Access to Pimcore elements via REST API is now only possible if the requesting user has a "view" permission for the elements in question.
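To illustrate the difference between extending and overriding a condition, here is a minimal Python sketch. It is not the Data Director's actual implementation (the bundle itself is written in PHP); the function name `combine_conditions` and the sample conditions are assumptions made for this example only.

```python
from typing import Optional


def combine_conditions(dataport_condition: str, request_condition: Optional[str]) -> str:
    """Extend the dataport condition with the request condition (hypothetical helper)."""
    base = dataport_condition.strip()
    extra = (request_condition or "").strip()
    if not base:
        return extra
    if not extra:
        return base
    # Wrap both parts in parentheses so that a top-level OR in one part
    # does not change how the two parts are combined.
    return "({}) AND ({})".format(base, extra)


# The dataport condition stays in force whatever the API caller sends.
combined = combine_conditions("o_published = 1", "name = 'abc' OR name = 'def'")
```

The sketch only shows the AND combination. It does not validate or sanitise the incoming condition, which a real implementation would still need to do.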
For incremental exports, the change timestamp of the last successful export is noted (in the object properties, analogous to imports). When the export is triggered again for this object, its current change timestamp (incl. potentially inherited fields) is compared with this property, and if the current change timestamp is not more current, the raw data of the object is not extracted again, i.e. not exported. This also allows an automatic incremental export to be performed for all potentially changed objects at once (was previously split between processes of single object exports).
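The timestamp check described above can be sketched roughly as follows. This is an illustrative Python sketch under assumptions, not the bundle's code: the function names and the idea of passing an inherited timestamp as a separate argument are hypothetical.

```python
from typing import Optional


def effective_modified(own_modified: int, inherited_modified: Optional[int] = None) -> int:
    """Return the newest change timestamp, taking inherited fields into account."""
    if inherited_modified is None:
        return own_modified
    return max(own_modified, inherited_modified)


def needs_export(current_modified: int, last_exported_modified: Optional[int]) -> bool:
    """Export only if the object changed after the last successful export."""
    if last_exported_modified is None:
        # No successful export has been noted yet, so the object must be exported.
        return True
    return current_modified > last_exported_modified


# An object whose own timestamp is old but whose parent changed later is exported again.
changed_via_parent = needs_export(effective_modified(100, 250), 200)
unchanged = needs_export(effective_modified(100), 200)
```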
Data Director 2.5.0: Performance optimisations
- Queues of automatic data ports are now processed in parallel. This ensures that a data port with only a few commands in the queue does not have to wait for the queues of other data ports to be processed.
- Reuse of raw data if there is already a raw data item with a current hash that has the same modification date as the current item.
- When relational fields are used as key fields for imports, the associated object IDs are resolved beforehand to avoid a query like "WHERE relationalField LIKE '%,123,%'" - instead, the much faster query in the form "o_id IN (345,13,58)" is now executed.
- The same optimisation is done for the SQL condition of Pimcore-based data ports.
- Unnecessary locks in the Pimcore parser have been removed.
- Fixed: Data query selectors were evaluated twice if they did not return a Concrete Object.
- Relations cache is now also used for data query selectors that return data from the queried object.
- 600% performance increase for image gallery/(extended) many-to-many relations asset mapping imports.
- When checking for changes to an object during an import, the fields are first sorted according to whether they are mapped or not. Thus, in attribute mapping, mapped fields are checked first because it is more likely that their values have been changed than those of unmapped fields.
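The key field optimisation for relational fields from the list above can be pictured with a small Python sketch. The two helper functions are hypothetical and only build condition strings; `relationalField` and `o_id` are taken from the examples in the text, and how the IDs are resolved beforehand is left out.

```python
from typing import Iterable


def like_condition(field: str, related_id: int) -> str:
    """Old form: substring search in the comma-separated relation column."""
    return "{} LIKE '%,{},%'".format(field, int(related_id))


def id_in_condition(object_ids: Iterable[int]) -> str:
    """New form: match the object IDs that were resolved beforehand."""
    # dict.fromkeys removes duplicates while keeping the original order.
    unique_ids = list(dict.fromkeys(int(i) for i in object_ids))
    if not unique_ids:
        # An empty IN () list is invalid SQL, so return a condition that matches nothing.
        return "1 = 0"
    return "o_id IN ({})".format(",".join(str(i) for i in unique_ids))


slow = like_condition("relationalField", 123)
fast = id_in_condition([345, 13, 58, 13])
```

A LIKE pattern with a leading wildcard generally cannot use an index, whereas a lookup by ID can, which is why the second form tends to be much faster.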
Extraction of raw data
- Added option to skip versioning for asset imports.
- Added support for glob expressions in combination with asset folders as import resource, e.g. /import/*.csv if /import is a Pimcore asset folder.
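As a rough illustration of how a glob expression such as /import/*.csv selects files from an asset folder, here is a minimal Python sketch. The function `match_assets` and the sample paths are assumptions for this example; the Data Director's own matching rules may differ in detail, for instance regarding subfolders.

```python
import posixpath
from fnmatch import fnmatchcase
from typing import List


def match_assets(asset_paths: List[str], pattern: str) -> List[str]:
    """Return the asset paths in the pattern's folder whose file name matches the glob."""
    folder, name_pattern = posixpath.split(pattern)
    return [
        path
        for path in asset_paths
        # Only direct children of the folder are considered in this sketch.
        if posixpath.dirname(path) == folder
        and fnmatchcase(posixpath.basename(path), name_pattern)
    ]


assets = ["/import/a.csv", "/import/b.txt", "/import/sub/c.csv", "/other/d.csv"]
selected = match_assets(assets, "/import/*.csv")
```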
Attribute mapping
- Index is recommended if callback function data query selector is used.
- History panel is no longer automatically reloaded when dataport runs are filtered (searched) or when a page other than page 1 is shown -> easier to search for a specific run.
- Added template for adding metadata to assets.
History and import panel
- SQL condition field is automatically focused when Pimcore-based dataports are manually launched from the Pimcore backend.
- Retain SQL condition from previous run to make it easier to run the same import/export multiple times (e.g. when setting up/testing dataports).
Other changes in version 2.5.0
- Extended many-to-many relationship supported as key field (stored differently in database than (extended) many-to-many object relationship).
- Enlarged column "fieldNo" so that 3-digit numbers are not truncated.
- Add callback function template to generate absolute asset/thumbnail URL.
- Bugfix: Optimise inheritance: more efficient check for parent objects whether values have actually been changed before they are saved.
- More accurate field value comparison when checking if an object has been changed by an import, so that a change from 0 to '' is detected.
- Element type and class name in the serializer for relational fields are provided.
- Bugfix: overlapping imports: raw data elements already processed are no longer processed again.
- Use of an error-tolerant JSON decoder to avoid aborting the entire import in case of incorrect character set encoding of the import document.
- Bugfix: Attribute mapping preview overlooked configured asset source folder (actual import worked correctly).
- Prevent multiple mapping of the same asset to the image gallery and many-to-many relation.
- Prevention of multiple parallel requests to update the history panel.
- Bugfix: When using a relative folder path as import resource, deleting the asset file after import (when invoked with --rm) now works.
- Imports with --force no longer check whether the item is currently locked for editing.
- Faster showing and hiding of columns when searching in the Dataport preview window.
- Bugfix: Starting exports by right-clicking on an object in the object tree is possible again.
- Bugfix: Raw data import is assigned the correct logger object and logs appear in the import run logs.
- Use of application_logs table (if used in Data Director) to find worst log level.
- Maximum runtime when searching history panel logs avoids timeout even if some items have been found.
- Users who start imports are stored in the versions, which enables better traceability of changes.
- Bugfix: Excel import with column index fixed.
- Raw data from currently exported dataport resources are no longer deleted.
- Manually uploaded data are assigned to default dataport resource.
- Stack traces are no longer written to versions in the database.
- Custom logic is used to compare image gallery fields for changes.
- Existing meta-column data for extended many-to-many asset relationships is retained.
- Pimcore documents matching the URL path for REST API requests are no longer loaded.
- Added support for Like search (wildcard search) with Data Query Selector.
- Added support for adding items to multiselect fields instead of always having to provide all selected options. This now works the same way as for relations, image galleries and other multi-value fields (previously the set options were always overwritten).
- Bugfix: all raw data is now processed for overlapping imports.
- Data port runs are marked as "aborted" if an uncaught exception occurs or the process is aborted, whether manually via CLI or automatically by the operating system.
- Support access to localised fields in the SQL condition of Pimcore-based dataports, e.g. name#en='abc'.
- Support of access to object brick fields in the SQL condition of Pimcore-based dataports, e.g. brickName.fieldName=123.
- Bugfix: Variable $params['rawItemData'] preview in attribute mapping for complex data.
- Bugfix: Demo data for complex XML data.
- Bugfix: XML parsing: multi-valued attributes returned [] for child nodes without value.
- INSERT ... ON DUPLICATE KEY UPDATE for queue items instead of REPLACE INTO, so that currently processed queue items are not deleted.
- Support access to element fields of ObjectMetadata, ElementMetadata, Hotspotimage objects without adding "element:" to the data query selector.
- Bugfix: Serializer for documents (for Pimcore 4 document structure).
- Existing assets for image galleries/many-to-many relations etc. are only recognised via MD5 hash if the files are stored locally.
- Warning if no key fields have been specified.
- Support calling service class methods from a data query selector, e.g. "field:@service_name::method".
- API keys/allowed dataports of other users are now only displayed to admins.
- Bugfix: Fields in the dataport configuration are no longer locked when a new dataport is created by a non-admin user.
- Error when generating the auto-complete SQL condition is prevented.
- Added special field "__updated" for file and URL based dataport types.
- Use of the Pimcore user's language for exports if a language is not explicitly specified.
- Support for '' / "" syntax (virtual fields with quotes) to better copy callback functions into reformatting IDEs without messing up code.
- Field collections are better serialised.
- Null values are included in serialisers.
- Added Importer::translate() method to call DeepL/AWS Translate translations for complex field values (e.g. translating field collections).
- Change validation for field collections with localised fields now works.
- Support for exporting all assigned bricks with the Data Director "brickFieldContainer" of the source data class.
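One change from the list above, the switch away from REPLACE INTO for queue items, is easier to see in a small experiment. The following Python sketch uses SQLite from the standard library, where the upsert syntax is INSERT ... ON CONFLICT ... DO UPDATE rather than MySQL's INSERT ... ON DUPLICATE KEY UPDATE. The table `queue` and its columns are invented for this example and do not reflect the Data Director's real schema.

```python
import sqlite3

REPLACE_SQL = "REPLACE INTO queue (item_id, payload) VALUES (?, ?)"
UPSERT_SQL = (
    "INSERT INTO queue (item_id, payload) VALUES (?, ?) "
    "ON CONFLICT(item_id) DO UPDATE SET payload = excluded.payload"
)


def row_after(statement: str):
    """Queue the same item again while it is being processed and return its row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE queue ("
        "item_id INTEGER PRIMARY KEY, "
        "payload TEXT, "
        "status TEXT NOT NULL DEFAULT 'pending')"
    )
    # The item is currently being processed by a worker.
    conn.execute("INSERT INTO queue (item_id, payload, status) VALUES (1, 'v1', 'processing')")
    conn.execute(statement, (1, "v2"))
    row = conn.execute("SELECT payload, status FROM queue WHERE item_id = 1").fetchone()
    conn.close()
    return row


# REPLACE deletes the existing row and inserts a new one, so the status is reset.
replaced = row_after(REPLACE_SQL)
# The upsert updates the existing row in place, so the status survives.
upserted = row_after(UPSERT_SQL)
```

With REPLACE the row that a worker is processing is deleted and recreated, while the upsert keeps it and only updates the payload.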
More info about the new version of the Data Director as well as a detailed version history can be found on GitHub.
For more information, detailed questions or advice about the Data Director Bundle, please contact David Gottschalk at any time.