LICENSE: CC BY-NC-SA 4.0 - Emily Riederer
Page not found!
Looking for content from a post or talk? Try searching the table below or returning home.
Otherwise, comment below the table with the link you were trying to reach, and I’ll point you in the right direction.
Date | Title | Description |
---|---|---|
Nov 10, 2024 | Role-Based Access Control for Quarto sites with Netlify Identity | A quick tech note on Netlify’s managed authentication solution |
Aug 15, 2024 | Python Rgonomics | A survey of modern python tooling that “feels good” to R users |
Jul 18, 2024 | Crosspost: Data discovery doesn’t belong in ad hoc queries | Data teams may struggle to quantify the benefits of good data documentation. But running countless ad hoc validation queries can incur both computational and cognitive cost. |
Jan 20, 2024 | Base Python Rgonomic Patterns | Getting comfortable in a new language is more than the packages you use. Syntactic sugar in base python increases the efficiency and aesthetics of python code in ways that R users may enjoy in packages like glue and purrr. This post collects a miscellaneous grab bag of tools for wrangling, formatting (f-strings), repeating (list comprehensions), faking data, and saving objects (pickle) |
Jan 15, 2024 | Crosspost: Why You Need Data Documentation in 2024 | Data documentation isn’t a box to check; it’s an active member of your team with many jobs-to-be-done. In this cross-post with Select Star, I write about how effective documentation can be your data products’ developer advocate for users, project manager for developers, and chief of staff for data leaders |
Jan 13, 2024 | polars’ Rgonomic Patterns | In this follow-up post to Python Rgonomics, we deep dive into some of the advanced data wrangling functionality in python’s polars package to see how its powertools like column selectors and nested data structures mirror the best of dplyr and tidyr’s expressive and concise syntax |
Jan 5, 2024 | Crosspost: Why you’re closer to data documentation than you think | Writing is thinking; documenting is planning and executing. In this cross-post with Select Star, I write about how teams can produce high-quality and maintainable documentation by smartly structuring planning and development documentation and efficiently recycling them into long-term, user-friendly docs |
Dec 30, 2023 | Python Rgonomics | Switching languages is about switching mindsets - not just syntax. New developments in python data science tooling, like polars and seaborn’s object interface, can capture the ‘feel’ that converts from R/tidyverse love while opening the door to truly pythonic workflows |
Nov 18, 2023 | Big ideas from the 2023 Causal Data Science Meeting | Five highlights and links to select talks |
Oct 23, 2023 | Data Downtime Horror Stories Panel | Panel discussion with Chad Sanderson and Joe Reis, hosted by Monte Carlo Data, on our thorniest brushes with data downtime, leading data teams to tackle data quality at scale with testing, contracts, observability and monitoring, and more. |
Sep 21, 2023 | Operationalizing Column-Name Contracts with dbtplyr | An exploration of how data producers and consumers can use column names as interfaces, configurations, and code to improve data quality and discoverability. The second half of the talk demonstrates how to implement these ideas with my dbtplyr dbt package. |
Jun 21, 2023 | Scaling Personalized Volunteer Emails | An overview of the data stack used to automate over 50,000 personalized emails to voter turnout volunteers using BigQuery, dbt, Census, and MailChimp |
Jun 7, 2023 | Causal Design Patterns | An overview of basic research design patterns in causal inference, modern extensions, and data management strategies to set up a causal inference initiative for success |
May 30, 2023 | Industry information management for causal inference | Proactive collection of data to comply with or confront assumptions |
May 12, 2023 | DataFold Data Quality Meet Up | Joined a panel of speakers to discuss tips and tricks for running dbt at scale |
May 3, 2023 | Crosspost: The Art of Abstraction in ETL | Rounding out my three-part ETL series from Airbyte’s developer blog |
Apr 13, 2023 | Posit Data Science Hangout | Each week, host Rachael Dempsey invites an accomplished data science leader to talk about their experience and answer questions from the audience. The discussion focuses mainly on the human elements of data science leadership. There’s no sales or marketing fluff, just great insights from inspiring professionals. |
Mar 22, 2023 | The Art of Abstraction in ETL: Dodging Data Extraction Errors | Cross-post from guest post on Airbyte’s developer blog |
Mar 22, 2023 | Evaluation without Experimentation | An introduction to inverse propensity of treatment weighting for program evaluation with applications to Two Million Texans’ relational organizing campaign during the 2022 midterms |
Mar 15, 2023 | Taking Flight with Shiny: a Modules-First Approach | An argument for the individual and organization-wide benefits of teaching new developers Shiny with a modules-first paradigm. |
Jan 17, 2023 | Crosspost: Power up your data quality with grouped checks | After a prior post on the merits of grouped data quality checks, I demo my newly merged implementation for dbt |
Nov 12, 2022 | The Data (error) Generating Process | Interrogating the data generating process to devise better data quality tests. |
Sep 25, 2022 | Goin’ to Carolina in my mind (or on my hard drive) | Out-of-memory processing of North Carolina’s voter file with DuckDB and Apache Arrow |
Sep 5, 2022 | Oh, I’m sure it’s probably nothing | How we do (or don’t) think about null values and why the polyglot push makes it all the more important |
Aug 26, 2022 | Update: grouped data quality check PR merged to dbt-utils | After a prior post on the merits of grouped data quality checks, I demo my newly merged implementation for dbt |
Jan 12, 2022 | The Data Engineering Podcast: Column Names as Contracts | Discussing how column names can serve as a light-weight alternative to data catalogs and contracts and how to implement this approach with dbtplyr |
Jan 2, 2022 | Using databases with Shiny | Key issues when adding persistent storage to a Shiny application, featuring {golem} app development and Digital Ocean serving |
Dec 11, 2021 | How to Make R Markdown Snow | Much like ice sculpting, applying powertools to absolutely frivolous pursuits |
Nov 27, 2021 | Make grouping a first-class citizen in data quality checks | Which of these numbers doesn’t belong? -1, 0, 1, NA. You can’t judge data quality without data context, so our tools should enable as much context as possible. |
Nov 17, 2021 | UIUC STAT447 (Data Science Programming) Guest Lecture | Discussing how to move from scripting to tool development, designing tools in enterprise, and navigating diverse data career paths |
Nov 10, 2021 | Why machine learning hates vegetables | A personal encounter with ‘intelligent’ data products gone wrong |
Sep 21, 2021 | Update: column-name contracts with dbtplyr | Following up on ‘Embedding Column-Name Contracts… with dbt’ to demo my new dbtplyr package to further streamline the process |
Aug 26, 2021 | A lightweight data validation ecosystem with R, GitHub, and Slack | A right-sized solution to automated data monitoring, alerting, and reporting using R (pointblank, projmgr), GitHub (Actions, Pages, issues), and Slack |
Jul 14, 2021 | Workflows for querying databases via R | Tricks for modularizing and refactoring your project’s SQL/R interface. (Image source techdaily.ca) |
May 27, 2021 | Understanding the data (error) generating processes for data validation | A data consumer’s guide to validating data based on the failure modes data producers try to avoid |
May 8, 2021 | A Tale of Six States: Flexible data extraction with scraping and browser automation | Exploring how Playwright’s headless browser automation (and its friends) can help unite the states’ data |
Feb 26, 2021 | Column Names as Contracts | Exploring the benefits of using controlled vocabularies to encode metadata in column names, and demonstrations of implementing this approach with the convo R package or dbt extensions of SQL. |
Feb 6, 2021 | Embedding column-name contracts in data pipelines with dbt | dbt supercharges SQL with Jinja templating, macros, and testing – all of which can be customized to enforce controlled vocabularies and their implied contracts on a data model |
Jan 30, 2021 | Causal design patterns for data analysts | An informal primer to causal analysis designs and data structures |
Jan 30, 2021 | Resource Round-Up: Causal Inference | Free books, lectures, blogs, papers, and more for a causal inference crash course |
Jan 21, 2021 | Building a team of internal R packages | On the jobs-to-be-done and design principles for internal tools |
Jan 21, 2021 | oRganization: Design patterns for internal packages | An overview of the unique design challenges and opportunities when building R packages for use inside of a single organization versus open-source. By using the jobs-to-be-done framework, this talk explores how internal packages can be better teammates by following specific design patterns for API design, testing, documentation, and more. |
Jan 16, 2021 | Generating SQL with {dbplyr} and sqlfluff | Using the tidyverse’s expressive data wrangling vocabulary as a preprocessor for elegant SQL scripts. (Image source techdaily.ca) |
Dec 30, 2020 | Introducing the {convo} package | An R package for maintaining controlled vocabularies to encode contracts between data producers and consumers |
Sep 20, 2020 | Sticker-driven maintenance | Marketing maintenance work with irrational exuberance |
Sep 12, 2020 | crosstalk: Dynamic filtering for R Markdown | An introduction to browser-based interactivity of htmlwidgets – no Shiny server required! |
Sep 6, 2020 | Column Names as Contracts | Using controlled dictionaries for low-touch documentation, validation, and usability of tabular data |
Jul 26, 2020 | A beginner’s guide to Shiny modules | Don’t believe the documentation! Shiny modules aren’t just for advanced users; they might just be a great entry point for development |
Jul 6, 2020 | projmgr: Managing the human dependencies of your project | A lightning talk on key features of the projmgr package which enables code-based planning and reporting workflows grounded in GitHub issues and milestones |
Jul 3, 2020 | Resource Round-Up: Latent and Lasting Documentation | Readings and assorted ideas about creating and maintaining low-overhead documentation |
Jun 30, 2020 | RMarkdown CSS Selector Tips | A few tips and tools for finding the right selectors to style in RMarkdown |
May 14, 2020 | projmgr: Managing the human dependencies of your projects | A walkthrough of using the projmgr package for GitHub-based project management via R |
Feb 1, 2020 | RMarkdown Driven Development: the Technical Appendix | A recommended tech stack for implementing RMarkdown Driven Development |
Jan 30, 2020 | RMarkdown Driven Development | How and why to refactor one time analyses in RMarkdown into sustainable data products |
Aug 30, 2019 | Resource Round-Up: R in Industry Edition | Case studies of the impact of R use on organizational culture and collaboration |
Aug 30, 2019 | Resource Round-Up: Reproducible Research Edition | An annotated bibliography of advice for getting started with reproducible research |
May 25, 2019 | Rtistic: A package-by-numbers repo | A walkthrough of a GitHub template for making your own RMarkdown and ggplot2 theme package |
May 7, 2019 | Notes on supporting conference speakers | Conference planning tips to design a good speaker experience |
May 4, 2019 | RMarkdown Driven Development (RmdDD) | A workflow for refactoring one-time analyses to sustainable data products |
Apr 20, 2019 | Notes on preparing a tech talk | A proposed workflow for methodically developing a good presentation |
Nov 1, 2017 | Assorted talks on designing analytical tools and communities for enterprise | A variety of talks related to creating innersource culture with R packages and related tools |
Nov 1, 2017 | tidycf: Turning analysis on its head by turning cashflows on their side | A case study on building an internal R package for customer lifetime value modeling at Capital One and leading broader analyst adoption of open-source tooling and reproducible workflows through a community of practice. |