Microsoft is bringing popular programming language Python to Excel. A public preview of the feature is available today, allowing Excel users to manipulate and analyze data from Python.

You won’t need to install any additional software or set up an add-on to access the functionality, as Python integration in Excel will be part of Excel’s built-in connectors and Power Query. Microsoft is also adding a new PY function that allows Python data to be exposed within the grid of an Excel spreadsheet. Through a partnership with Anaconda, an enterprise Python repository, popular Python libraries like pandas, statsmodels, and Matplotlib will be available in Excel.

  • THED4NIEL@lemmy.world
    11 months ago

    One of my current workflows is

    1. SAP Database Export to Excel
    2. integrate a list of needed datapoint IDs
    3. copy and debloat dataset manually
    4. format the data as a list
    5. integrate lookup formulas to assign two fields: data needed (bool) and category
    6. manually copy data into predefined tables that manipulate XML templates and fix formatting in the CMS
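
    Steps 2, 3 and 5 above (the ID list, the debloating, and the lookup of a "needed" flag plus a category) can be sketched in plain Python roughly like this. The IDs, field names and categories are made up for illustration; a real SAP export will look different.

```python
# Hypothetical sample data; real field names and IDs would come from the SAP export.
raw_rows = [
    {"id": "DP-001", "value": "12.5"},
    {"id": "DP-002", "value": "7.0"},
    {"id": "DP-003", "value": "3.2"},
]

# Step 2: the list of needed datapoint IDs, mapped here to an assumed category.
needed = {
    "DP-001": "dimensions",
    "DP-003": "weight",
}


def annotate(rows, needed_ids):
    """Return copies of the rows with 'needed' (bool) and 'category' added."""
    annotated = []
    for row in rows:
        category = needed_ids.get(row["id"])
        annotated.append(
            {**row, "needed": category is not None, "category": category or ""}
        )
    return annotated


# Step 5: assign the two lookup fields.
annotated_rows = annotate(raw_rows, needed)

# Step 3: "debloat" by keeping only the rows that are actually needed.
kept_rows = [row for row in annotated_rows if row["needed"]]
```

    A dictionary lookup stands in for the lookup formulas here, so adding or removing a needed ID only means editing one mapping.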

    On more complex data sets (min-max of multiple datasets, sometimes with calculations for values outside the metric system): create a new sheet and spice the data with formulas and then continue with step 6
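
    For the min-max case, a minimal sketch could convert everything to one unit first and only then compare. The dataset names and values below are hypothetical, and only inches and millimetres are handled.

```python
INCH_TO_MM = 25.4  # one inch is exactly 25.4 mm by definition

# Hypothetical datasets: each reading is a (value, unit) pair.
datasets = {
    "supplier_a": [(10.0, "mm"), (12.5, "mm")],
    "supplier_b": [(0.5, "in"), (0.25, "in")],
}


def to_mm(value, unit):
    """Convert a single reading to millimetres."""
    if unit == "mm":
        return value
    if unit == "in":
        return value * INCH_TO_MM
    raise ValueError(f"unsupported unit: {unit}")


def min_max_mm(data):
    """Return (min, max) in millimetres across all datasets."""
    values = [to_mm(v, u) for readings in data.values() for v, u in readings]
    return min(values), max(values)


lowest, highest = min_max_mm(datasets)
```

    Raising on an unknown unit is a deliberate choice in this sketch: a silent pass-through would mix units in the result.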

    With Python I can create more complex functions where I only need to prep one file as a template, copy the raw data into the template and come up with an almost completely production-ready result for the CMS. So basically I could remove two-thirds of the whole process.

    I have done stuff like this in Excel already, though less sophisticated and with much less time saved, but when you need VSCode to see your whole formula it gets messy as hell. Non-breaking relative cell access is especially hard to get right the first time: instead of relying on fixed “A2”-like references, it works along the lines of “get position of header X, access row Y relative to data Z”. The huge advantage is that you can remove columns and rows and your formulas still work.
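
    The same header-relative idea is easy to show in Python. This is only an illustration with an invented sheet layout, not the actual formulas: the lookup finds a column by its header name, so deleting another column does not break it.

```python
# Hypothetical sheet as a list of rows; the first row holds the headers.
sheet = [
    ["ID", "Length", "Category"],
    ["DP-001", 12.5, "dimensions"],
    ["DP-002", 7.0, "weight"],
]


def column_index(rows, header):
    """Return the position of a header in the first row."""
    return rows[0].index(header)


def cell(rows, header, data_row):
    """Return the value under `header`; data_row 0 is the first row below the headers."""
    return rows[data_row + 1][column_index(rows, header)]


before = cell(sheet, "Category", 1)

# Remove the "Length" column; the header-based lookup still finds "Category".
length_col = column_index(sheet, "Length")
trimmed = [row[:length_col] + row[length_col + 1:] for row in sheet]
after = cell(trimmed, "Category", 1)
```

    A fixed reference such as "third column, third row" would point past the end of the row after the deletion, while the header lookup returns the same value before and after.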

    I goddamn love automating my work <3

    • rhymepurple
      11 months ago

      I don’t know all the details, but is there anything that you couldn’t do with dynamic array formulas, Power Query, or VBA? While Power Query requires the data to be formatted as an Excel table (or, recently, as a dynamic array output), both dynamic array formulas and VBA can take data from noncontiguous regions and convert it to a single, contiguous data region. All of these tools can achieve tasks such as sorting rows, filtering rows, reordering columns, removing columns, etc.

      Alternatively, why couldn’t this be done in Python (without Excel)? Any Excel formula used to manipulate the data can be accomplished in Python. Additionally, having it in Python (outside of Excel) may help you further automate your process by doing the SAP Database Export and entering the data into your CMS.
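
      As a rough sketch of what that could look like without Excel, the operations mentioned above (combining separate regions, filtering, sorting, reordering and removing columns) fit in a few lines of plain Python. The data and field names are invented for the example.

```python
# Two hypothetical, separate data regions with the same fields.
region_a = [
    {"id": "DP-003", "value": 3.2, "note": "x", "category": "weight"},
    {"id": "DP-001", "value": 12.5, "note": "y", "category": "dimensions"},
]
region_b = [
    {"id": "DP-002", "value": 7.0, "note": "z", "category": "weight"},
]

# Combine the regions into one contiguous list of rows.
combined = region_a + region_b

# Filter rows, then sort them by ID.
filtered = [row for row in combined if row["value"] > 5]
ordered = sorted(filtered, key=lambda row: row["id"])

# Reorder the columns and drop "note" by listing only the wanted fields.
columns = ["category", "id", "value"]
result = [{name: row[name] for name in columns} for row in ordered]
```

      With pandas the same steps would be shorter, but this version needs nothing beyond a standard Python install.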

      • THED4NIEL@lemmy.world
        11 months ago

        Sorry for the late reply.

        Yes, it would be possible to an extent, though a construct like that is very slow to create and barely maintainable. Formulas don’t scale well with complexity, and Power Query sometimes breaks when the table size changes horizontally.

        > Alternatively, why couldn’t this be done in Python (without Excel)?

        I also considered this.

        Create a template, fill it with formulas and templates and let a Python script handle all data manipulation via Pandas from outside of Excel, then save.

        The downside of this is that you need proficient users to handle the script and pip packages, while many people in my field know how to use Excel.

        I don’t know the most effective way yet, but I’ll keep my options in mind.

        Automatically feeding the data to the CMS has proven difficult. The CMS is designed in a way that doesn’t account for manipulation by other programs, and such manipulation often breaks it.