Well, I'm returning to the subject with a vengeance for what will probably be the last UCOVI Blog post of 2023. This is because Q4 2023 has got me using Alteryx. Has this exposure changed my mind about no code tools?
In a clunky flashback to my article from March, I made an even clunkier Formula One analogy about Alteryx's no-code-data market leadership and said that they were the Max Verstappen of the No Code F1 paddock. It turned out not to be too wide of the mark, as Alteryx list F1 3rd-place hopefuls McLaren among their clients and have them front and centre of their marketing.
I want you to watch their Youtube promo from earlier this year and take heed.
Observe the sass 20 seconds in when power-striding, tracksuit-clad, young, smart, attractive Antipodean McLaren girl says **no code needed**, causing desk-sitting boomer Steve to sit up and take note, mind totally blown. He even has floppy disks, the silly old sausage. The implication is that Alteryx is the driver, the engine, and the team principal of data-driven performance and insight, and data nerds with their Python, SQL and Fortran are now as obsolete as the incompetent, slow mechanics doing 15-second pitstops from the 90s days of Damon Hill.
If these guys have ever spoken to an actual data engineer or analyst before, it evidently happened in the same room the McLaren F1 engines were being tested, or when said data engineer was intensely debugging code and thus making no verbal sense at all.
I'd like to put Alteryx's "no code needed" claim to the test.
I was using it to blend together a range datasets in various flat-file formats, including Excel spreadsheets. Nothing out of the ordinary here, but we're about to hit a snag. The data I needed from Excel was in data tables (Ctrl + T) across several worksheets. The worksheets could contain more than one table, as well as additional standalone formula cells outside of the table made by the person maintaining and using it.
Point being, the fact of it being in an Excel table is what identified it as the data needed in Alteryx, which unfortunately cannot recognise or support official Excel tables. It can support specific cell-address A1:F237 ranges or name-manager named ranges but has no concept of Excel data tables, which play best with Power BI, are what modern Excel Power Query transformations and routines are read into, and have any additional columns added to them automatically reflected in pivot table refreshes. Hardly obscure functionality for use in edge cases.
Addressing this doesn't seem to be on their dev roadmap, and it has been asked about twice on their Community Support page, here and here. Both times, the person responding from Alteryx seems to adopt the tactic of pretending that an Excel data table and a named range are the same thing and giving them named range support, then continuing to play dumb for a bit, before finally saying "Well if the range updates Alteryx won't pick it up, too bad."
So to the million dollar question: how do you solve a problem like Excel data tables in a state of the art data blending tool that obviates the need for code but doesn't support Excel data tables?
Read the whole sheet in and locate the first null column in the header row, and the last row where there's a data value. Flimsy and quite frankly ridiculous.
Run some VBA on the Excel file to add a named range alias for each table, so that there are named ranges for Alteryx to read that mimic the data tables. Better, but an automation stumbling block.
Ask the person who maintains the Excel file to stop using Excel data tables and save the data you need in specifically named sheets with no other items, because the no code blending tool your company has forked out £4k per user per year for needs them to. Sub-optimal.
Option number 4, the one I went with, was to use some code. The Alteryx Run Command tool allows you to run PowerShell - which comes as standard on a Windows and is as easy as Python to script with - as long as your PowerShell script saves to a csv file for the tool to read into the workflow as output.
So in 130 lines of my best no-code-tool-mandated code, I wrote a routine to save an Excel file as a ZIP archive, unzip it, loop through the XML files that live under the hood of a .xlsx and which conspire to tell you the sheets and cell ranges of your data tables, and then write this out to a csv and back into the Alteryx workflow.
If someone out there somewhere has had the same trouble with Excel data tables as I've had and appreciates my solution, I'll be a happy man and encourage them to get in touch. But I'll be even more happy if someone who would otherwise go down the no code data route after a steak and a sales pitch from an Alteryx account manager reads this and decides to put their trust in the data-team Python whiz instead.