A script to convert Wordpress XML dump to markdown files。
And then run the following command:
Use
But you could specify different directory structure and file naming pattern using
Each exported file has a straightforward structure intended for further processing with public-static website generator. It has an INI-like formatted header followed by markdown-formatted post (or page) contents:
Wordpress to Markdown Exporter
A python script to convert Wordpress XML dump to a set of plain text/markdown files. Intended to be used for migration from Wordpress to public-static website generator, but could also be helpful as general purpose Wordpress content processor.Installation
The script could be installed by command:pip install -e git+git://github.com/dreikanter/wp2md#egg=wp2md
It will install wp2md and the following dependencies:Usage
Export Wordpress data to XML file (Tools → Export → All content):And then run the following command:
wp2md -d /export/path/ wordpress-dump.xml
Where /export/path/
is the directory where post and page files will be generated, and wordpress-dump.xml
is the XML file exported by Wordpress.Use
--help
parameter to see the complete list of command line options:usage: wp2md [options] source
Export Wordpress XML dump to markdown files
positional arguments:
source source XML dump exported from Wordpress
optional arguments:
-h, --help show this help message and exit
-v verbose logging
-l FILE log to file
-d PATH destination path for generated files
-u FMT <pubDate> date/time parsing format
-o FMT <wp:post_date> and <wp:post_date_gmt> parsing format
-f FMT date/time fields format for exported data
-p FMT date prefix format for generated files
-m preprocess content with Markdown (helpful for MD input)
-n LEN post name (slug) length limit for file naming
-r generate reference links instead of inline
-ps PATH post files path (see docs for variable names)
-pg PATH page files path
-dr PATH draft files path
-url keep absolute URLs in hrefs and image srcs
-b URL base URL to subtract from hrefs (default is the root)
The output
The script generates a separate file for each post, page and draft, and groups it by configurable directory structure. By default posts are grouped by year-named directories and pages are just stored to the output folder.But you could specify different directory structure and file naming pattern using
-ps
, -pg
and -dr
parameters for posts, pages and drafts respectively. For example -ps {year}/{month}/{day}/{title}.md
will produce date-based subfolders for blog posts.Each exported file has a straightforward structure intended for further processing with public-static website generator. It has an INI-like formatted header followed by markdown-formatted post (or page) contents:
title: Я.Субботник в Санкт-Петербурге, 3 декабря
link: http://paradigm.ru/yandex-subbotni
creator: admin
description:
post_id: 635
post_date: 2011-11-23 22:10:35
post_date_gmt: 2011-11-23 19:10:35
comment_status: open
post_name: yandex-subbotnik
status: publish
post_type: post
# Я.Субботник в Санкт-Петербурге, 3 декабря
Я.Субботник в Санкт-Петербурге пройдет 3 декабря в [офисе Яндекса](http://company.yandex.ru/contacts/spb/).
...
If the post contains comments, they will be included below.See also
- How to export Wordpress data
- How to export Wordpress.com data